Battlefield commanders are now requesting real-time visual battlefield
information. These requests place an enormous strain on current transmission resources
due to the file size of the images. As more and more visual information is sent, the 
ability to compress images efficiently becomes a significant issue. This thesis 
investigates whether any of the new image compression algorithms (Radiant TIN, Titan 
ICE, or Low Bit Rate) achieve higher compression ratios than the National Imagery 
Transmission Format Standard currently used by the Department of Defense. Titan ICE 
was found to perform better than Radiant TIN; however, the difference is not statistically
significant. The Navy already has the proprietary rights to Radiant TIN. Therefore, in 
the absence of statistical significance, Radiant TIN is the recommended image
compression algorithm for future use by the Department of Defense.

EXECUTIVE SUMMARY


Communication, or the lack of it, between battlefield commanders and
their subordinates has played an essential role in determining the outcome of many battles
throughout history. As technology advances, battlefield communication evolves, and the 
information that battlefield commanders require to make decisions increases. 

The Chairman of the Joint Chiefs of Staff issued two concept papers, Joint Vision
2010 (JV2010) - A Vision for the Future and EJV2010, Concept for Future Joint 
Operations, Expanding Joint Vision 2010. JV2010 looks at improvements in 
communications and how they can be incorporated into the military to improve its overall 
capabilities. JV2010 and EJV2010 emphasize the extreme importance of information and 
the ability to access that information on the battlefield in the future of war fighting. 

The realization of JV2010 and EJV2010 requires that each military unit 
commander possess near real-time information on all activities within his region of 
responsibility. JV2010 combines the capabilities of "Intelligence, Surveillance, and
Reconnaissance (ISR)" and "Command, Control, Communication, and Computers (C4)" 
to acquire and assimilate the information needed to neutralize adversarial forces and 
effectively employ friendly forces. The task of distributing data is complicated by 
numerous factors including, but not limited to, time, non-homogeneous equipment 
(ranging from mainframes to hand-held computers), hostile atmospheric conditions, and 
finite bandwidth capacity. (JCS, 1996) 

JV2010 and EJV2010 lay the groundwork for the development of communication
systems to improve the ability of commanders to make timely and informed decisions on
the battlefield. The foundation for these new forces is the ability to communicate quickly
and efficiently to achieve information superiority. Information superiority is the ability 
to collect, process, and disseminate an uninterrupted flow of information while exploiting 
or denying an adversary’s ability to do the same. (JCS, 1997) Failure to achieve 
Information Superiority (IS) puts both the goals of JV2010 and EJV2010, and the future 
of the Services at risk. 

In order to achieve information superiority, the time required to transfer 
information to the battlefield must be minimized. The use of digital imagery increases 
transfer times and reduces the efficiency of communication systems. Image compression 
algorithms are required to reduce the storage size of digital images and reduce the time 
required to transmit these images. The three compression algorithms examined in this 
thesis are designed to improve the ability of today’s communication systems to transfer 
large digital images. 

Lossy compression algorithms significantly reduce the transmission times and 
storage space required for digital imagery. This reduction of image file size is necessary 
to meet increasing imagery requirements without upgrading current systems throughout 
the military. Additionally, the military does not have a standard format for tactical 
imagery. The standardization of the image format will allow for seamless transfers of 
imagery during joint operations. Currently, the Director of Space Information Warfare
Command and Control (N6) is evaluating these three lossy algorithms, Titan ICE (ICE),
Interim Low Bit Rate (LBR), and Radiant TIN (RTN), to replace the current algorithm
being used by the Navy. All three of these new algorithms can achieve compression
ratios in excess of 60 times the maximum compression ratio of the algorithm currently
used by the Navy. The Navy has proprietary rights to the code for RTN; therefore,
selecting either ICE or LBR will require additional funding. This thesis will help N6 to 
determine if RTN should be selected as the Navy’s new compression algorithm, or 
ultimately the DoD’s. 

Four experiments compare the performance of the three tested algorithms. 
Reaction time and accuracy data are collected for target detection and identification testing
using simple and complex background images. Additionally, pairwise subjective 
comparisons of image quality are collected. The simple background images consist of 
U.S. Navy ships at sea. The complex background images are of automobiles parked in a 
wooded area. The automobiles are partially occluded by the foliage to reduce the amount 
of information about the automobile in the image. 

The algorithm and compression ratio do not affect the identification of ships in
a simple background. There are large individual differences in the respective subjects’
abilities to identify the ships. Accuracy in identifying ships could likely be improved
by additional training of the subjects. The identification of cars
in the complex background shows that ICE performs best, followed by RTN and then LBR.
However, there is no statistically significant difference between ICE and RTN. RTN
consistently performs better than LBR in target detection. It also performs better than 
LBR in the identification and subjective rankings of both the complex and simple images.
There is no statistically significant difference in reaction times based on compression
algorithms. The only result consistent throughout the reaction time testing is that as the
compression ratios increase, the subjects’ reaction times slow.

The results of subjective rankings of image quality are consistent across both the 
simple and complex background test images. ICE is the algorithm subjectively preferred
over either of the other algorithms at all compression ratios. RTN is consistently
preferred over LBR with the exception of the lowest compression ratios. This effect is 
also observed in the accuracy testing. The difference reflects LBR’s ability to compress
images with less noticeable distortion at the lowest compression ratios and thus does not
invalidate the overall preference for RTN.

ICE’s overall performance is better than RTN; however, the difference between 
the two algorithms is not statistically significant. Additionally, the ICE compression 
software is limited by its graphical user interface to compression ratios of 100 to 1; RTN
does not have this limitation. Furthermore, the Navy already has the proprietary rights to 
RTN. RTN is the recommended compression algorithm. Any future testing should
include the NITF 2.0 standard that was released at the completion of this study, and ICE
should be reevaluated if the 100 to 1 software limitation is removed.

I. INTRODUCTION


Communication, or the lack of it, between battlefield commanders and
their subordinates has played an essential role in determining the outcome of many
battles throughout history. As technology advances, battlefield communication evolves, 
and the information that battlefield commanders require to make decisions increases. 
Commanders are no longer content with a written synopsis of an area, but want to see 
near real-time imagery of the area. Due to the large amount of data required to encode an
image and the limited speed at which data can be transferred with the current
communication systems, image compression algorithms are required. This thesis is
designed to help the Director of Space Information Warfare Command and Control (N6)
compare three such image compression algorithms.

Early communications were limited to line of sight, using such methods as smoke 
signals, torches, flashing light, and semaphore flags (Holzman & Pehrson, 1995). In the 
1790s, Napoleon used an optical telegraph, developed in 1793, to communicate with his 
commanders in the field. These optical telegraphs were stationed on hilltops throughout 
France and used articulated arms to encode messages and transmit them more than 5,000 
kilometers. (National Academy of Sciences [NAS], 1997) The optical telegraph's 
limitation to line of sight was removed with the invention of the electric telegraph in 
1844, which could transmit beyond visible distances (Bray, 1995). 

Radio began to replace the electric telegraph when Guglielmo Marconi
demonstrated in 1895 that radio signals could be detected over great distances (Masini, 1996). In
World War I, radio played an important role when both sides cut telegraph cables to
disrupt the flow of communications. Both the German and the British militaries used 
radios to communicate during World War I. In 1940 during World War II, the first hand- 
held radio was issued to the troops. This allowed mobile units to coordinate over large 
areas and revolutionized warfare. In 1957, the Soviet Union launched Sputnik, 
demonstrating the capability of space based communication systems. The first U.S. 
military satellite was launched in 1966 by the U.S. Air Force and was capable of both 
digital voice and data communications (NAS, 1997). The U.S. military continues to 
improve upon these original satellites and places new and more powerful satellites in 
orbit, improving communications on the battlefield. 

In July of 1996, the Chairman of the Joint Chiefs of Staff issued a concept
paper, Joint Vision 2010 (JV2010) - A Vision for the Future. JV2010 looks at 
improvements in communications and how they can be incorporated into the military to 
improve its overall capabilities. In May of 1997, to further define the direction in which 
the armed forces should focus their developments and plans for the future, the Joint 
Chiefs issued EJV2010, Concept for Future Joint Operations, Expanding Joint Vision 
2010. JV2010 and EJV2010 emphasize the extreme importance of information and the 
ability to access that information on the battlefield in the future of war fighting. The 
information by itself is insufficient; soldiers on the battlefield must have the ability to 
obtain the required information and have it when they need it. (Joint Chiefs of Staff
[JCS], 1996; JCS, 1997)


The Navy, along with the other services, is adjusting force structures toward a 
network centric battle force, reliant on the concept of having real-time battlefield and 
situational awareness through informational awareness. Wireless communication is the 
enabling technology allowing network-centric warfare to be achieved. According to Sun 
Tzu, if you know the enemy and know yourself, you need not fear the result of a hundred 
battles. If you know yourself but not the enemy, for every victory gained you will also 
suffer a defeat. (Handel, 1992) 

The realization of JV2010 and EJV2010 requires that each military unit 
commander possess near real-time information on all activities within his region of 
responsibility. JV2010 combines the capabilities of "Intelligence, Surveillance, and
Reconnaissance (ISR)" and "Command, Control, Communication, and Computers (C4)" 
to acquire and assimilate the information needed to neutralize adversarial forces and 
effectively employ friendly forces. The task of distributing data is complicated by 
numerous factors including, but not limited to, time, non-homogeneous equipment 
(ranging from mainframes to hand-held computers), hostile atmospheric conditions, and 
finite bandwidth capacity. (JCS, 1996) 

Briggs and Goldberg (1995) demonstrate in their study the need for timely and 
accurate information. The study looks at military operations in the Persian Gulf during 
Desert Storm. Their findings show that 35 of 148, or 24%, of the U.S. casualties and 72
of 467, or 15%, of U.S. injuries resulted from misidentification, or “friendly fire.” American
forces destroyed 27 of the 35, nearly 80%, of the U.S. M1 Abrams tanks and Bradley
Fighting Vehicles that were lost. This is significant given that Iraqi cannon fire could not penetrate
American tanks. The difficulty in identifying friendly forces is not new to the electronic
battlefield, but the consequences are deadlier and faster. Clausewitz states that the 
difficulty of accurate recognition constitutes one of the most serious sources of friction in 
war (Handel, 1992). 

As the battlefield becomes more complex, and decisions concerning deadly force 
are made at an ever-increasing rate, information about both one’s own forces and the 
enemy’s is needed to prevent “friendly fire.” The speed at which the commander in the
field receives the data can be the difference between destroying a “friendly tank” or being 
shot by the enemy while waiting for the information. The extreme cost of incorrectly 
identifying a target strongly influences the decision criteria of a commander on the 
battlefield. In a tactical situation, there is a strong bias to identify a vehicle as a foe, 
given any doubt. (Briggs & Goldberg, 1995) 

JV2010 and EJV2010 lay the groundwork for the development of communication 
systems to improve the ability of commanders to make timely and informed decisions on 
the battlefield. The foundation for these new forces is the ability to communicate quickly 
and efficiently to achieve information superiority. Information superiority is the ability 
to collect, process, and disseminate an uninterrupted flow of information while exploiting 
or denying an adversary’s ability to do the same. (JCS, 1997) Failure to achieve 
Information Superiority (IS) puts both the goals of JV2010 and EJV2010, and the future 
of the Services at risk. 

In order to achieve information superiority, the time required to transfer
information to the battlefield must be minimized. The use of digital imagery increases
transfer times and reduces the efficiency of communication systems. Image compression
algorithms are required to reduce the storage size of digital images and reduce the time 
required to transmit these images. The three compression algorithms examined in this 
thesis are designed to improve the ability of today's communication systems to transfer 
large digital images. 

Currently N6 is evaluating these three algorithms, Titan ICE (ICE), Low Bit Rate 
(LBR), and Radiant TIN (RTN), to replace the current algorithm being used by the Navy. 
All three of these new algorithms can achieve compression ratios in excess of 60 times 
the current maximum compression ratio of the algorithm used by the Navy. The Navy 
has proprietary rights to the code for RTN; therefore, selecting either ICE or LBR will 
require additional funding. This thesis will help N6 to determine if RTN should be 
selected as the Navy’s new compression algorithm and ultimately the Department of 
Defense’s (DoD). 

The remaining chapters of this thesis are organized as follows.
Chapter II covers the driving forces behind the need for image compression algorithms. 
This chapter starts by presenting a few of the collection resources available in today’s 
military followed by a brief history of the development of image compression algorithms. 
The two major categories of compression algorithms will be explained, concluding with 
the current DoD compression standard. Chapter III and Chapter IV describe the basic 
algorithms in each of the two major categories of compression algorithms. Chapter V 
describes the three algorithms tested in this thesis. The methods for testing these three
algorithms are presented in Chapter VI, and the data analysis follows in Chapter VII.
The results of the data analysis and conclusions of this thesis are discussed in Chapter
VIII.


II. BACKGROUND 


A. COLLECTION 

As technology and equipment continue to advance, digital information will be an 
integral part of operations in the battlefield. Although transmission equipment continues 
to improve and bandwidth capabilities are expected to be significant, these capabilities
are also expected to fall short of transmission needs (Gonzalez et al., 1994). There are
several methods and devices capable of imaging the battlefield and transmitting the 
information to a collection agency. Each of these sources fills a very specific need in the 
information-gathering arena, though there are advantages and disadvantages to using 
each. 

All the services are capable of collecting intelligence and imagery of one kind or 
another. The collection range of sophisticated imaging equipment is extensive, from 
equipment mounted on ships and land, to hand-held film and digital cameras. As 
computing power continues to increase, and the size of digital equipment continues to 
decrease, miniaturized, electronic-surveillance devices will increasingly be found on the 
battlefield. 

With the invention of lighter-than-air vehicles and airplanes, information 
collection has moved to non-terrestrial sources. Today's reconnaissance aircraft have the 
ability to be tasked in real-time to meet current objectives of the operational commander. 
Additionally, these aircraft have higher resolution sensors, and are able to monitor a
region for extended periods of time. The disadvantage of aircraft is the risk to both man
and machine while flying over enemy-controlled areas. Aircraft are susceptible to being
shot down or interdicted by the enemy during missions. Some of the aircraft currently in 
use are the U-2, the SR-71 Blackbird, the P-3 Orion, the TARPS-equipped F-14 Tomcat,
and Unmanned Aerial Vehicles (UAVs).

With the launch of Intelsat I (Early Bird) into a geosynchronous orbit in 1965,
the United States entered into space-based communications. The DoD today has access to
many space-based systems, both commercial and governmental. These sources include 
weather satellites, such as the Defense Meteorological Satellite Program, with the 
primary function of collecting weather data for operational forces. This satellite can 
provide real-time tactical data, and circles the globe every 12 hours. Another is Landsat,
a joint NASA/DoD platform capable of providing multi-spectral imagery. This system 
maps the surface of the earth, and spots man-made objects using both visible and infrared 
sensors. Finally, Lacrosse is a NASA system that uses Synthetic Aperture Radar orbiting 
in a low earth orbit. 

These space-based collection systems were key in the planning and execution of 
the Gulf War. The satellites provided real-time weather information to aid the 
commanders in mission decision-making. The information from the satellites supported 
everything from the movement of individual aircraft to entire armored units, through the 
launch, en route, target, and recovery phases of their missions. (Muolo, 1993) 

Some disadvantages of satellites include the problem of positioning the satellite
above the intended target; whether or not the satellite can take pictures during daylight or
nighttime; and how often the satellite is above the target. Additionally, the resolution of a
satellite image is usually not as fine as that of an image taken by an aircraft or a
ground-based sensor. Some advantages of satellites are that satellites are not vulnerable to being
shot down by missiles; that satellites in geo-stationary orbits can stay above a target for
an indefinite period of time; and that satellites provide images without placing pilots at
risk.

The need for compression software, specifically image compression software for
today's collection sources, is twofold. The first reason is the large number of bits required to
represent an image digitally. The second is the limited ability to receive large streams of
data rapidly. Most collection sources have ground-based control units capable of
receiving the vast quantities of information being transmitted. The re-transmitted
imagery desired by the local commanders is limited by “bandwidth”. The UAV, for
example, has the capacity to transmit line of sight (LOS) to its Mission Control Element
(MCE) at 137 Mbits/sec. The commander on the ground can only receive between 1.5 and
10 Mbits/sec (Figure 1), with most users limited to 1.5 Mbits/sec. (Waller, 1996)

Figure 1 LOS imagery transmission strategy

When information has to be relayed via satellite, the
maximum rate at which data can be transmitted is reduced to about 50 Mbits/sec (Figure 2). A
satellite relay is required any time the UAV is being operated beyond the LOS of the
MCE. To overcome the limitations of bandwidth and reduce transmission time, it is
mandatory to use methods that reduce the number of bits required to represent the required
data in real time. (Waller, 1996)

Figure 2 SATCOM imagery transmission strategy





The vast amount of digital data required to represent an image can best be
demonstrated by transmitting a single image from LANDSAT. A single image
comprised of approximately 6,000 x 6,000 pixels, each pixel composed of 8 bits,
requires approximately 2.9 x 10^8 bits of data (Rabbani & Jones, 1991), or roughly
288 megabits (36 megabytes). The battlefield commanders would have to wait as long as 9 minutes for
this single uncompressed file. The advantage of compression algorithms of 100 to 1
and above is that they can reduce the time required for the commanders to receive this
LANDSAT information to less than six seconds.
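
The arithmetic behind this example can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, and the 1.5 Mbits/sec link rate is simply the typical user rate quoted above. The effective rate behind the nine-minute figure is not stated, so the sketch does not attempt to reproduce it. What it does show is that a 100 to 1 compression ratio divides any transfer time by 100, so nine minutes (540 seconds) would become 5.4 seconds.

```python
def image_bits(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Number of bits in an uncompressed raster image."""
    return width_px * height_px * bits_per_pixel


def transmit_seconds(bits: float, link_bits_per_sec: float,
                     compression_ratio: float = 1.0) -> float:
    """Time to send an image of `bits` bits after compressing it by
    `compression_ratio` (1.0 means no compression)."""
    return bits / compression_ratio / link_bits_per_sec


# LANDSAT-sized image from the text: 6,000 x 6,000 pixels at 8 bits each.
raw_bits = image_bits(6000, 6000, 8)   # 288,000,000 bits, about 2.9 x 10^8
raw_megabytes = raw_bits / 8 / 1e6     # 36.0 megabytes

# ASSUMPTION: 1.5 Mbits/sec, the typical user rate quoted in the text.
link_rate = 1.5e6

t_uncompressed = transmit_seconds(raw_bits, link_rate)
t_compressed = transmit_seconds(raw_bits, link_rate, compression_ratio=100)

print(f"uncompressed: {t_uncompressed:.1f} s, at 100 to 1: {t_compressed:.2f} s")
```

Whatever link rate is assumed, the ratio between the two times equals the compression ratio, which is why compression is attractive when the link itself cannot be upgraded.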

B. COMPRESSION BACKGROUND 
Traditional compression schemes manipulate signal image pixel values based on
mathematical formulas, without regard to the way the final reproduced signal is seen by a
human user. These mathematical functions are appropriate for some data, such as
measurements or text, but they fail to take advantage of the characteristics of the human
visual system. For example, if greater compression can be achieved, and the associated
image quality loss is not perceivable to the human eye, then more of the data can be
removed. Compression methods that take advantage of these phenomena
are referred to collectively as “perceptual coding.”

Perceptual coding can be accomplished through a variety of means. It usually 
involves using models of human perception, such as a human visual-system model. 
Computer vision models are becoming increasingly sophisticated in their attempt to
emulate the human visual system. These new computational models are leading to
more accurate models of Just Noticeable Difference (JND) and noise masking.
Hardware has also improved to the point where practical digital signal
processors can support perceptual coding. (Jayant, Johnston, & Safranek, 1993)

These models can be quite complex and their incorporation into compression 
algorithms is quite involved, requiring cooperation among psychologists, computer 
scientists, and engineers. The potential gains justify the development effort and have 
been estimated to yield 10-50% improvements in efficiency of compression, with no 
perceptual distortion. One approach is to transform the raw data, using a perceptual 
model, into features deemed important for perception. These features are then explicitly 
compressed and used to reconstruct the signal. Another approach is to incorporate
perceptual knowledge into the computation of measurements of distortion and fidelity.
These data are then used to produce computer code that will represent the image.
Regardless of the specific method, sensible incorporation of human perception is likely
both to provide substantial improvements in compression performance, and to do so
without significantly influencing tactical recognition.

Moeller and Hurlbert (1997) published a well-worded discussion of an important 
factor of target recognition. They discussed image segmentation as follows: 

    Recognizing and locating objects are fundamental tasks of the human
    visual system — but to determine ‘what’ is ‘where’, the visual system must
    first segment the image into regions likely to correspond to distinct
    objects. It is generally assumed that image segmentation, in which similar
    regions are grouped together and segregated from dissimilar regions,
    occurs at an early, preattentive level of visual processing (Moeller and
    Hurlbert, 1997, p. 106).

This is an area where image coding can significantly influence recognition. At
high levels of compression, most coding algorithms introduce distortion. This distortion 
appears in the form of blocked regions of reduced contrast and resolution. These regions 
may disrupt the preattentive level of visual processing and increase the probability of a 
false recognition or a miss. 

The study of “texture discrimination” is also closely related to image
compression. Texture is the quality of a surface that gives the observer the feeling of a
uniformly colored area caused by quasiperiodic repetitions of some patterns. (Gonzalez
et al., 1994) Texture-discrimination task performance depends on background noise. It is
harder to find a texture with a noisy background than it is to find one without a noisy
background. Identical orientations of background noise and texture reduce performance.
A simulation model for human texture-discrimination tasks shows this asymmetry, and
the researchers concluded that the ability to differentiate texture from background
increases with the variability in their orientation. (Caputo, 1996; Rubenstein & Sagi,
1990)

An image compression algorithm that does not consider the physiology of the 
visual system can mask targets. Masking occurs when distortion reaches levels that begin 
to blend foreground and background objects, thus decreasing texture variability.

C. CATEGORIES OF COMPRESSION 

Image compression is divided into two major categories, “lossy” and “lossless”.
A lossless compression algorithm is one that guarantees that its decompressed output is
bit-for-bit identical to the original input. This is a much stronger claim than “visually
indistinguishable from the original”. Lossy algorithms remove detail that
the human eye is unable to perceive. For most photo-like images, a
large amount of data can be removed, while still maintaining most of the original visual
features (Figure 3). (Beser, 1994)

Figure 3 Relationship between lossless and lossy compression

All image compression algorithms consist of two basic components, the encoder
and the decoder (Figure 4). If the encoder retains all the information from the original
image, the algorithm is lossless. Conversely, if it discards information to improve the
compression ratio, it is a lossy algorithm.
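
The lossless guarantee can be illustrated with a short round trip through an encoder and a decoder. The sketch below uses Python's standard `zlib` module (DEFLATE) purely as a stand-in for a general-purpose lossless coder; it is not one of the algorithms evaluated in this thesis, and the pixel data are made up for illustration.

```python
import zlib

# Hypothetical 8-bit pixel data with heavy repetition (easy to compress).
original = bytes([10, 10, 10, 10, 200, 200, 200, 200] * 64)

# Encoder: produces the compressed representation.
compressed = zlib.compress(original, 9)

# Decoder: reconstructs the data from the compressed representation.
restored = zlib.decompress(compressed)

# Lossless means bit-for-bit identical, not merely "visually similar".
is_lossless = restored == original
ratio = len(original) / len(compressed)

print(f"identical: {is_lossless}, compression ratio: {ratio:.1f} to 1")
```

A lossy coder would fail the equality check by design, trading exact reconstruction for a higher compression ratio.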


Figure 4 Generic Image Compression System (image data, encoder, compressed data, decoder, reconstructed image)

Lossy image compression techniques add an additional step when compressing an image.
The additional step is quantization, where a reduced number of bits represents the
transformed data (Figure 5). (Sanford, 1995)

Figure 5 Generic Image Lossy Algorithm
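
A minimal sketch of the quantization step is given below, assuming a uniform scalar quantizer. The step size, the sample values, and the function names are hypothetical and chosen only for illustration; the quantizers used by the algorithms tested in this thesis are not described here. With a step size of 16, values in the range 0 to 255 map to indices 0 to 15, so each value can be stored in 4 bits instead of 8, at the cost of a reconstruction error of at most half a step.

```python
def quantize(values, step):
    """Map each value to the index of its quantization bin (information is lost here)."""
    return [v // step for v in values]


def dequantize(indices, step):
    """Reconstruct each value as the center of its quantization bin."""
    return [i * step + step // 2 for i in indices]


# Hypothetical transformed data in the range 0..255 (8 bits per value).
coefficients = [0, 3, 17, 64, 130, 255]
step = 16  # ASSUMPTION: illustrative step size; 256 / 16 = 16 bins, i.e. 4 bits per value

indices = quantize(coefficients, step)
restored = dequantize(indices, step)
errors = [abs(r - c) for r, c in zip(restored, coefficients)]

print("indices: ", indices)
print("restored:", restored)
print("max error:", max(errors))
```

Because several input values share one index, the decoder cannot recover the original values exactly, which is what makes the scheme lossy.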


Lossless algorithms are limited to a compression ratio bounded by the entropy of
the image. The entropy is the lower limit on the average number of bits that would be
required to encode each pixel. The maximum compression ratio can then be calculated by dividing
the number of bits required to code all possible values of a pixel by the entropy of the
image. For example, for a gray-scale image using eight bits to code each pixel (0 to 255) and
having an entropy of two bits per pixel, the maximum compression ratio attainable using a
lossless compression algorithm is four to one. (Rabbani & Jones, 1991)
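
The four-to-one example can be reproduced with a short calculation. The sketch below assumes the first-order entropy, which treats each pixel as an independent symbol; the function names and the test image are hypothetical. Coders that exploit correlation between neighboring pixels can do better than this first-order figure, so it should be read as an illustration of the bound rather than a limit on every lossless scheme.

```python
import math
from collections import Counter


def entropy_bits_per_pixel(pixels):
    """First-order entropy: H = -sum(p * log2(p)) over the observed pixel values."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def max_lossless_ratio(bits_per_pixel, entropy):
    """Upper bound on the compression ratio for a coder limited by `entropy`."""
    return bits_per_pixel / entropy


# Hypothetical 8-bit image in which only four gray levels occur, equally often.
pixels = [0, 64, 128, 255] * 256

h = entropy_bits_per_pixel(pixels)   # 2 bits per pixel
bound = max_lossless_ratio(8, h)     # 8 / 2 = 4, i.e. four to one

print(f"entropy: {h:.2f} bits/pixel, maximum lossless ratio: {bound:.1f} to 1")
```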

The evaluation of lossy compression algorithms has been done predominantly
with mathematical tools. The most common calculation is the mean square
error (MSE), which is the average squared error between the original value of a pixel and
the compressed value of that pixel. The peak signal-to-noise ratio (PSNR) and the
maximum error between the uncompressed pixel value and the compressed value are also
used. (Reiter, 1996) These numeric performance measures do not adequately describe
image quality. An image that has good visual image qualities may have a high numerical
error such as MSE. Algorithms that operate on spatial frequencies to which the human
visual system is not sensitive may produce a large error while still being visually
indistinguishable from the original. (Brower, 1994) With increased compression, a shift
is required from traditional mathematical evaluation techniques to perceptual
psychophysical performance measures.
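
For reference, the three numeric measures mentioned above can be sketched as follows, using the common definitions MSE = (1/N) Σ (x_i - y_i)^2 and PSNR = 10 log10(peak^2 / MSE). The pixel values and function names are hypothetical, and the peak value of 255 assumes 8-bit imagery.

```python
import math


def mse(original, reconstructed):
    """Mean square error between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def psnr_db(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in decibels (infinite for identical images)."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


def max_abs_error(original, reconstructed):
    """Largest absolute difference between corresponding pixels."""
    return max(abs(o - r) for o, r in zip(original, reconstructed))


# Hypothetical original and decompressed pixel values.
original = [52, 55, 61, 66]
decoded = [54, 55, 60, 70]

print("MSE:", mse(original, decoded))
print("PSNR (dB):", round(psnr_db(original, decoded), 2))
print("max error:", max_abs_error(original, decoded))
```

None of these quantities takes the human visual system into account, which is the limitation this section points out.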


D. CURRENT STANDARD

The current image compression algorithm in use by the United States Department 
of Defense is the National Imagery Transmission Format Standard (NITF), introduced in 
1987. NITF is designed to work on low-cost workstations with limited power and 
storage space to send and receive data. The system design also accommodates poor
transmission lines that may include noise. (Brower, 1994) The Chairman of the
Committee on Imagery Requirements and Exploitation directed the adoption of NITF by 
the Intelligence Community as the standard for image transfer in May 1989 (National 
Imagery and Mapping Agency [NIMA], 1999). 

NITF is designed to transmit a file, composed of an image accompanied by
sub-images, symbols, labels, text, and other information related to the image (NIMA, 1999).
One main feature of NITF is that it allows for several items of each data type to be
included in one file (Figure 6) (Paragon, 1999).

Figure 6 NITF File Format





The file is submitted to the Message Transfer Facility (MXF), which allows it to be
transferred using any of the user-selectable protocols and media. The current image
compression algorithm being used is a low-bit-rate compression algorithm. The original
standard is based on Joint Photographic Experts Group (JPEG) compression with pre-
and post-processing. The pre- and post-processing allow for compression ratios greater
than 60 to 1, which is about the maximum achievable with standard JPEG. The “pre” and
“post” are simple reduction routines, not part of a comprehensive compression algorithm.
The inclusion of the added routines allows for the implementation of the system without
developing a new standard.¹

The following sections address different schemes for compressing images. The 
new algorithms being tested in this thesis are all derived from concepts and functions 
developed from these schemes. Lossless schemes will be discussed first, for they are 
implemented in the coding schemes of several of the lossy algorithms. The three 
algorithms being examined in this thesis are based on lossy schemes and will be 
discussed last. Any of the compression algorithms can be used by NITF due to the
independence of NITF from the actual image format.

¹ The MXF functionality of the NITF software is not germane to this study and will not be addressed.

III. LOSSLESS SCHEMES


All of the following lossless algorithms are based on variable-length code words. 
Variable-length code words allow for more efficient coding because each value is 
represented with the minimum number of bits it requires. These 
lossless schemes are used as part of the coding process for the lossy algorithms. 

A. HUFFMAN 

The Huffman encoding scheme is based on the probabilities of pixel-value 
combinations appearing in the image. The probability of each combination of pixel 
values is calculated. The shortest code word is then assigned to the pixel-value 
combination that has the largest probability of appearing. The next shortest is then 
assigned to the next most probable combination. This process is repeated until all possible 
pixel-value combinations are assigned a code word. The only rules for coding are, first, 
that no two combinations are assigned identical codes, and second, that each code is 
constructed such that no additional indication is necessary to specify where a code begins 
and ends once the starting point is known (Huffman, 1952). This general process has 
been modified in two ways to increase the efficiency of the coding scheme. 

The first modification is to reduce the code-word set by combining the least 
probable pixel values into one code word. A pixel in this group is coded as the code word for the 
combined pixel values followed by the value itself. If the probability of this combined 
code word is low, the complexity and storage requirements are 
decreased without any significant decrease in the efficiency of the coding. 


The second modification is to eliminate the need to make two passes over the image, 
one to calculate the probabilities of each pixel value and the next to code the values. Either 
calculating the probabilities from a small part of the image or using a standard probability 
table based on typical images removes the need for a separate full pass to gather the 
probabilities (Rabbani & Jones, 1991). Images with a high correlation will have the greatest compression ratios. 
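
The basic code-word assignment described in this section, without the two modifications, can be illustrated with a short sketch. The sketch below is not taken from any of the cited works; it assumes single pixel values rather than pixel-value combinations, and the function names and the pixel data are hypothetical. 

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Return {symbol: bit string}; more frequent symbols get shorter codes."""
    counts = Counter(symbols)
    if len(counts) == 1:  # a lone symbol still needs a one-bit code
        return {next(iter(counts)): "0"}
    # Each heap entry is (count, tie_breaker, {symbol: partial code}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, low = heapq.heappop(heap)    # least probable group
        n1, _, high = heapq.heappop(heap)   # next least probable group
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


def huffman_decode(bits, table):
    """Walk the bit string; a code word ends as soon as it matches a table entry."""
    inverse = {code: symbol for symbol, code in table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    return decoded


# Hypothetical image data: 100 pixels drawn from four gray levels.
pixels = [0] * 50 + [255] * 25 + [128] * 15 + [64] * 10
table = huffman_table(pixels)
encoded = "".join(table[p] for p in pixels)
```

For this hypothetical data a fixed two-bit code would need 200 bits, while the variable-length table needs 175. Because no code word is a prefix of another, the decoder needs no separator between code words. 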

B. RUNLENGTH 

The runlength coder is used to code the series of ones and zeros of a binary file. 
A symbol is used to indicate the start of a run, followed by the number of instances. For 
example, if there are 92 consecutive occurrences of zero, the storage requirement is 184 
bytes (92 occurrences x 2 bytes/occurrence = 184 bytes). By coding these zeros as a run, the storage 
requirement is now 2 bytes, for a compression ratio of 92 to 1. This runlength 
compression can then be coded using Huffman coding to increase the compression ratio 
by as much as three to one (Reiter, 1996). The simplest implementation of this 
algorithm codes each individual line in the image with its own Huffman code. Higher- 
order implementations take into account the previous lines in the image to develop the 
Huffman code table for those lines. These higher-order implementations are more 
efficient because they take into account the vertical correlation of the image (Rabbani & Jones, 
1991). 
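
A minimal sketch of the runlength idea follows. It stores each run as a value and a count, which is an assumed representation for illustration only; the storage format of the cited implementations is not reproduced here, and the sample row is hypothetical. 

```python
def run_length_encode(bits):
    """Collapse a sequence into [value, run length] pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([b, 1])       # start a new run
    return runs


def run_length_decode(runs):
    """Expand [value, run length] pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical binary image row: 92 zeros, 3 ones, 5 zeros.
row = [0] * 92 + [1] * 3 + [0] * 5
runs = run_length_encode(row)
```

The 100 input values collapse to three runs, and decoding the runs restores the row exactly, so the scheme is lossless. 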

C. DIFFERENTIAL PULSE CODE MODULATION 

Differential Pulse Code Modulation (DPCM) is a predictive algorithm. There 
is a great degree of correlation between adjacent pixels in most images. Predictive 
algorithms use this tendency to estimate the value of the next pixel, then encode only the 
difference between the predicted value and the actual value. This value is usually smaller than 
the original pixel value, thus reducing the number of bits required to store the 
information. The number of pixels used to estimate the next pixel is the order of the 
predictor. As the order increases, the accuracy of the estimate for the next pixel 
increases. There is a diminishing return on increasing the order of the predictor. For 
efficiency, most do not go past the third order. (Rabbani & Jones, 1991) 

DPCM can be encoded either as a lossless or lossy algorithm. For this example, a 
third-order global predictor is used to predict pixel values in the image. The difference 
between the predicted value of the pixel and the actual value of the pixel is then stored. 
A simple polynomial can use the pixels to the left, upper left, and above to predict the 
next value. A, B, and C are arranged around x, the predicted value (Table 1). 

Table 1 Coefficient Arrangement 
With the polynomial 0.75A-0.5B+0.75C, the pixel values in Table 2 yield the predicted 
values in Table 3 and the differences in Table 4. (Rabbani & Jones, 1991) 
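
The lossless form of this predictor can be sketched as follows. The sketch assumes that A is the pixel to the left, B the pixel to the upper left, and C the pixel above, that predictions are rounded to the nearest integer, and that the first row and first column are stored unchanged; none of these details is confirmed by the text, and the sample image is hypothetical. 

```python
import math


def predict(a, b, c):
    """Third-order prediction 0.75A - 0.5B + 0.75C, rounded to an integer."""
    return math.floor(0.75 * a - 0.5 * b + 0.75 * c + 0.5)


def dpcm_encode(image):
    """Replace each interior pixel by (actual - predicted); borders are kept as-is."""
    diff = [row[:] for row in image]
    for y in range(1, len(image)):
        for x in range(1, len(image[0])):
            diff[y][x] = image[y][x] - predict(
                image[y][x - 1], image[y - 1][x - 1], image[y - 1][x])
    return diff


def dpcm_decode(diff):
    """Rebuild pixels in raster order so every neighbor is already reconstructed."""
    image = [row[:] for row in diff]
    for y in range(1, len(diff)):
        for x in range(1, len(diff[0])):
            image[y][x] = diff[y][x] + predict(
                image[y][x - 1], image[y - 1][x - 1], image[y - 1][x])
    return image


# Hypothetical smooth image: values rise by 2 per column and 1 per row.
image = [[100 + y + 2 * x for x in range(4)] for y in range(4)]
differences = dpcm_encode(image)
restored = dpcm_decode(differences)
```

On this smooth test image every interior difference is 1, far smaller than the pixel values near 100, which is the property that makes the differences cheaper to code. 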
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Table 3 Predicted Values of Pixels 
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Table 4 Difference in Pixel Values 

The coded values typically have a Laplacian distribution with a 
mean of zero. This distribution also has a reduced variance, 
which increases the ability to compress the image (Figure 7) (Rabbani & Jones, 1991). If 
the image is to be coded as a lossy image, a quantizer is added to reduce the 
precision of each pixel value. By reducing the accuracy of the pixel values, a smaller 
range of numbers can be used to code these new values. With a smaller set of values, 
fewer bits can then be used to encode the remaining pixel values. This is the step where 
information about the image is lost. Coding the prediction error rather than the 
data results in a higher quality reconstruction of the original image (Venkataraman & 
Farrelle, 1994). 


Figure 7 Differential Image Plot (vertical axis: number of pixels) 


IV. LOSSY SCHEMES 


The algorithms tested in this thesis, RTN, ICE, and LBR, are lossy algorithms. 
Lossy algorithms can achieve higher compression ratios by removing some of the 
information in the reconstructed images. The following lossy algorithms are the building 
blocks for the next generation of compression schemes. 
A. TRANSFORM CODING 

Transform coding is applied to the images by first dividing the image into sub 
images or blocks. Each of these blocks then has the transform applied to it independent 
of all the other blocks. For example, the JPEG compression algorithm divides the image 
into 8X8 blocks. These 8X8 blocks are then transformed using a unitary transform. A 
unitary transform is a reversible linear transformation made of complete, orthonormal 
discrete-basis functions. (Rabbani & Jones, 1991) The unitary transform causes a 
reduction in the variance of the coefficient values. The remaining coefficients can then 
be coded in either a lossless or lossy algorithm. 

1. Coordinate Axes Rotations 

Coordinate axes rotation takes advantage of the fact that there is high enough 
correlation between adjacent pixels to reduce the variance in the pixel intensity values. 
The majority of the correlations lie along a 45° diagonal (Figure 8). To reduce the MSE 
when coding the image, each sub-block of the image is rotated about the axes. Let 


X=(x1,x2) and Y=(y1,y2) represent the location of a pixel before and after the rotation, 
respectively. The following transform is applied to each of the blocks: 

$$Y = AX, \qquad A = \begin{bmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$$ 

(Rabbani & Jones, 1991) 
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Figure 8 Rotation Transform 


2. Basis Function Decompositions 

More generally, the above rotation is a transformation by a basis, Y=AX, where A 
is any basis. The decoder would then be X=BY, where B=A^{-1}, and other bases can be 
used. These transformations are commonly used in discrete cosine transforms (discussed 
later), which are based on sines and cosines of different frequencies, leading to a spectral 
decomposition of the original image. (Rabbani & Jones, 1991) 
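
The encoder and decoder relationship can be illustrated with the 45° rotation as the basis. The sketch below assumes that form of A and uses hypothetical pairs of adjacent pixel values; it shows that after the rotation most of the variation is carried by the first component, and that the inverse basis recovers the original pairs. 

```python
import math

S = 1 / math.sqrt(2)
A = [[S, S], [-S, S]]   # assumed forward basis: rotation by 45 degrees
B = [[S, -S], [S, S]]   # decoder basis, the inverse of A (here also its transpose)


def apply(matrix, vector):
    """Multiply a 2 x 2 matrix by a 2-element vector."""
    return (matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1])


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical pairs of adjacent pixel values; they cluster along the 45-degree diagonal.
pairs = [(100, 102), (50, 49), (200, 203), (120, 118), (80, 80)]
rotated = [apply(A, p) for p in pairs]
restored = [apply(B, q) for q in rotated]

variance_before = variance([p[1] for p in pairs])     # second component before rotation
variance_after = variance([q[1] for q in rotated])    # second component after rotation
```

The second rotated component has a much smaller variance than either original component, so it can be coded with fewer bits. 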

3. Discrete Cosine Transform 

The most common frequency transformation is the Discrete Cosine 
Transformation (DCT). Let the horizontal and vertical indices of the transformed block 
be u and v, and the original horizontal and vertical indices of the block be j and k. F(u,v) 
is the value at the position u,v in the transformed block and f(j,k) is the pixel value 
at the position j,k in the original block. C(w) = 1/\sqrt{2} for w = 0, and 1 otherwise. The DCT sub-divides 
the image into small, square blocks of n x n pixels, and then uses the following transformation on each 
block: 

$$F(u,v) = \frac{4\,C(u)\,C(v)}{n^2} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} f(j,k) \cos\left[\frac{(2j+1)u\pi}{2n}\right] \cos\left[\frac{(2k+1)v\pi}{2n}\right]$$ 

Table 5 is an example of an 8 x 8 block to be transformed using the DCT, while Table 6 is the 
transformed block. The Inverse DCT is used to restore the original image and is defined 
as: 

$$f(j,k) = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} C(u)\,C(v)\,F(u,v) \cos\left[\frac{(2j+1)u\pi}{2n}\right] \cos\left[\frac{(2k+1)v\pi}{2n}\right]$$ 

(Rabbani & Jones, 1991) 
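
A direct, unoptimized sketch of a forward and inverse block DCT is given below. It assumes one common normalization, in which the forward transform carries the factor 4C(u)C(v)/n² and the inverse carries only C(u)C(v), with C(0) = 1/√2 and C = 1 otherwise; the function names and the sample block are hypothetical. 

```python
import math


def c(w):
    """Normalization term: 1/sqrt(2) for the zero-frequency index, 1 otherwise."""
    return 1 / math.sqrt(2) if w == 0 else 1.0


def basis(i, w, n):
    """Cosine basis value for spatial index i and frequency index w."""
    return math.cos((2 * i + 1) * w * math.pi / (2 * n))


def dct2(f):
    """Forward DCT of an n x n block."""
    n = len(f)
    return [[4 * c(u) * c(v) / n ** 2
             * sum(f[j][k] * basis(j, u, n) * basis(k, v, n)
                   for j in range(n) for k in range(n))
             for v in range(n)] for u in range(n)]


def idct2(F):
    """Inverse DCT of an n x n block of coefficients."""
    n = len(F)
    return [[sum(c(u) * c(v) * F[u][v] * basis(j, u, n) * basis(k, v, n)
                 for u in range(n) for v in range(n))
             for k in range(n)] for j in range(n)]


# Hypothetical 8 x 8 block with a gentle gradient.
block = [[100 + 2 * j + k for k in range(8)] for j in range(8)]
coefficients = dct2(block)
restored = idct2(coefficients)
flat = dct2([[10] * 8 for _ in range(8)])   # a constant block has only one nonzero term
```

Applying the inverse to the unquantized coefficients restores the block to within floating-point error. Loss enters only when the coefficients are quantized. 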
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Table 6 Transformed 8 x 8 Block 
JPEG image compression is an example of a DCT algorithm designed for 
compressing either full-color or gray-scale images of natural, real-world scenes. It works 
well on photographs, naturalistic artwork, and similar material. It does not work as well 
on lettering, simple cartoons, or line drawings. JPEG is designed to exploit known 
limitations of the human eye, notably the fact that small color changes are perceived less 
accurately than small changes in brightness. 

Typically, gray-scale images do not compress by large factors. Because the 
human eye is much more sensitive to brightness variations than to hue variations, JPEG 
can compress hue data more heavily than brightness (gray-scale) data. A gray-scale 
JPEG file is generally only about 10%-25% smaller than a full-color JPEG file of similar 
visual quality. However, the uncompressed gray-scale data is only 8 bits/pixel, or one- 
third the size of the color data; therefore, the calculated compression ratio is much lower. 
The threshold of visible loss is often around five to one compression for gray-scale 
images. The exact threshold at which errors become visible depends on the viewing 
conditions. The smaller an individual pixel, the harder it is to see an error; therefore 
errors are more visible on a computer screen (at 70 or so dots/inch) than on a high-quality 
color printout (300 or more dots/inch). Thus, a higher-resolution image can tolerate more 
compression. (JPEG Image compression FAQ, 1998) 

4. Walsh-Hadamard Transform 

The Walsh-Hadamard Transform (WHT) is not as efficient as the DCT algorithm 
but has the advantage of being simple to implement. All of the coefficients in the WHT 
basis are either +1 or -1. The basis can be recursively formed to generate a square 
matrix of any power-of-two size. (Rabbani & Jones, 1991) 
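
The recursive construction can be sketched as follows. The sketch uses the standard doubling rule, in which a matrix H of size n produces the matrix [[H, H], [H, -H]] of size 2n; the ordering of the rows and the scaling used by any particular coder are assumptions, and the sample signal is hypothetical. 

```python
def hadamard(n):
    """Build the n x n matrix of +1/-1 entries; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if n == 1:
        return [[1]]
    half = hadamard(n // 2)
    top = [row + row for row in half]
    bottom = [row + [-x for x in row] for row in half]
    return top + bottom


def wht(values):
    """Transform a vector using additions and subtractions only."""
    return [sum(h * v for h, v in zip(row, values)) for row in hadamard(len(values))]


signal = [4, 2, 2, 0]                                  # hypothetical samples
coefficients = wht(signal)
restored = [c / len(signal) for c in wht(coefficients)]  # same matrix, scaled by 1/n
```

Because every basis entry is +1 or -1, the forward transform needs no multiplications beyond sign changes, which is the source of its simplicity. 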
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5: Symbolic 

Symbolic Transformation is designed to code images that contain man-made objects. 
The image is transformed into a symbolic representation of the entire scene instead 
of pixels. Features that are well defined by this algorithm are edges, arcs, and changes in 
texture. The background is coded separately from the objects coded by symbols. By 
coding the detail and texture separately, the algorithm benefits from the strong points of 
both vector and texture coding. Symbolic Transformation is designed to increase the detail 
of the reconstructed image to maintain more edge information and enhance the 
compressed image. 

6. Subband 

These are frequency-sensitive compression algorithms. Each image is divided 
into separate images containing only specific frequency data. This division is 
accomplished with the use of high- and low-pass filters. The advantage of dividing the 
image into these separate subbands is that the different frequencies can be coded so that 
those that are most noticeable to the eye are not compressed as highly. Uniformly scaling 
the just-noticeable-difference (JND) threshold controls the step sizes of the quantizer to 
control compression and/or perceptual quality. (Gonzalez et al, 1994) 

An example of a subband algorithm is a wavelet transformation. A wavelet 
transformation is a set of repeated one-dimensional high- and low-pass filters. The two 
equations that make up the filters are composed of a scaling function and a wavelet 
function. Wavelet compression can provide between 1.5 and 4 times better performance 
for a given image quality compared to JPEG. (Reiter, 1996) 
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One of the common outcomes of compression algorithms is a tiling or blocking 
effect in a compressed image. Many algorithms first divide the image into regions or 
blocks and then execute the compression routines on each block, causing the tiling effect. 

An alternative method of compressing images at higher compression ratios is to 
use hierarchical coding, employing wavelet transformations. Typically, these 
reconstructed images, compressed using wavelets, do not exhibit the tile effect. In 
wavelet coders the image is usually recursively decomposed into several subbands which 
are then quantized. Optimal quantization for each of the subbands is not trivial in linear- 
phase wavelets since the subbands are not mutually orthogonal. (Venkataraman & 
Farrelle, 1994) 

B. QUANTIZATION 

Quantization is the process of reducing the amount of data used to represent the 
image. Setting a threshold and removing all values that fail to meet the minimum 
required value is one method. An alternate method is to reduce the number of different 
values used in the data. By binning the data the number of unique pixel intensities is 
reduced. 
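
Both methods can be sketched in a few lines. The threshold, the bin width, and the sample values below are hypothetical and are chosen only to illustrate the two ideas. 

```python
def threshold(values, minimum):
    """Zero every value whose magnitude fails to meet the minimum."""
    return [v if abs(v) >= minimum else 0 for v in values]


def bin_values(values, step):
    """Map each value to the midpoint of the fixed-width bin that contains it."""
    return [(v // step) * step + step // 2 for v in values]


coefficients = [120, -3, 45, 2, -60, 1]        # hypothetical transform coefficients
kept = threshold(coefficients, 10)

pixels = [3, 17, 64, 65, 130, 131, 200, 255]   # hypothetical 8-bit intensities
binned = bin_values(pixels, 32)
```

With a bin width of 32, the eight distinct input intensities are reduced to five distinct values, so fewer bits are needed per pixel at the cost of precision. 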

1. Lloyd-Max 

The Lloyd-Max Quantizer is a staircase function that first partitions input pixel values 
into N intervals with boundaries d_0,...,d_N. This quantizer then maps the original values 
into discrete values termed reconstruction levels r_0,...,r_{N-1}. The values are derived to 
minimize the expected square error D between the original pixel values and the 
transformed pixels: 

$$D = \sum_{i=0}^{N-1} \int_{d_i}^{d_{i+1}} (e - r_i)^2 \, p_e(e) \, de, \tag{4.1}$$ 

where p_e(e) is the distribution of the pixel values to be quantized. (Rabbani & Jones, 1991) 
The solution to equation (4.1) will yield decision levels “halfway between the 
neighboring reconstruction levels and reconstruction levels that lie at the center of the 
mass of the probability density enclosed by the two adjacent decision levels” (Rabbani 
and Jones, 1991, p. 84), in particular 

$$d_i = \frac{r_{i-1} + r_i}{2}, \qquad r_i = \frac{\int_{d_i}^{d_{i+1}} e \, p_e(e) \, de}{\int_{d_i}^{d_{i+1}} p_e(e) \, de}.$$ 

There is no closed form solution to equation (4.1). Numerical techniques must be used to 
estimate the solution. When the pixel data is successfully transformed, the transformed 
data has a Laplacian density. A 3-bit quantizer can be used with the parameters 
listed in Table 7 and shown in Figure 9. (Rabbani & Jones, 1991) 
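
One numerical technique is to alternate between the two conditions quoted above until the levels settle. The sketch below applies that iteration to hypothetical, randomly generated Laplacian samples rather than to an analytic density; it is an illustration under those assumptions and does not reproduce the values in Table 7. 

```python
import random


def cell_index(value, decisions):
    """Count the decision levels below the value, giving the interval it falls in."""
    return sum(value > d for d in decisions)


def midpoints(levels):
    """Decision levels halfway between neighboring reconstruction levels."""
    return [(a + b) / 2 for a, b in zip(levels, levels[1:])]


def distortion(samples, levels):
    """Mean squared error when each sample is replaced by its reconstruction level."""
    decisions = midpoints(levels)
    return sum((e - levels[cell_index(e, decisions)]) ** 2 for e in samples) / len(samples)


def lloyd_max(samples, n_levels, iterations=30):
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / n_levels
    levels = [lo + (i + 0.5) * step for i in range(n_levels)]   # uniform starting point
    for _ in range(iterations):
        decisions = midpoints(levels)
        cells = [[] for _ in levels]
        for e in samples:
            cells[cell_index(e, decisions)].append(e)
        # Move each level to the centroid of its cell; empty cells keep their level.
        levels = [sum(c) / len(c) if c else r for c, r in zip(cells, levels)]
    return midpoints(levels), levels


rng = random.Random(0)
# Hypothetical prediction errors with a zero-mean Laplacian density.
samples = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(2000)]
span = max(samples) - min(samples)
uniform = [min(samples) + (i + 0.5) * span / 8 for i in range(8)]
decisions, levels = lloyd_max(samples, 8)
```

Each pass can only lower or maintain the mean squared error, so the eight resulting levels quantize these samples with less distortion than the uniform levels used as the starting point. 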
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Table 7 Typical Eight-Level Lloyd-Max Quantizer Distrabution 
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Figure 9 Typical Eight level Lloyd-Max Quantizer Distrubution 

2. Vector Quantization 

Vector Quantization (VQ) is a process for converting the image into vectors, much 
like a symbolic algorithm. The symbols or vectors come from a ‘codebook’ of vectors. 
This codebook is generally developed by ‘training.’ Training of the codebook is 
accomplished by examining images similar to the images to be compressed. The more 
representative of the images the codebook is, the better the algorithm will perform. 
(Rabbani & Jones, 1991) 

The image is divided into vectors, and then these vectors are coded from the 
codebook. A code and a location of the start of the vector in the original image now 
represent each vector in the original image. If the image does not use a predefined 
codebook, the codebook must be sent in addition to the compressed image. The VQ 
coders are theoretically optimal from the information point of view. However, as 
compression is increased beyond one bit per pixel (bpp), the tile effect and 
blocking artifacts are again observed in the reconstructed image. (Venkataraman & 
Farrelle, 1994) Vector quantization is the extreme case of symbolic transformation, but 
the texture information is not coded separately; it is all coded as edges. VQ is used for 
computer auto-identification. This algorithm is not used in any of the tested algorithms. 
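
The coding step can be illustrated with a small, hand-written codebook. In the sketch below the codebook, the vector length of four pixels, and the sample row are all hypothetical, and the training of the codebook is not shown. Because the vectors are taken in fixed-size blocks in raster order, the position of each vector is implied by its place in the index list. 

```python
def nearest(block, codebook):
    """Index of the codeword with the smallest squared error to the block."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(block, codebook[i])))


def vq_encode(pixels, codebook, size):
    """Split the pixels into vectors of `size` values and code each as an index."""
    blocks = [pixels[i:i + size] for i in range(0, len(pixels), size)]
    return [nearest(block, codebook) for block in blocks]


def vq_decode(indices, codebook):
    """Replace each index by its codeword."""
    out = []
    for i in indices:
        out.extend(codebook[i])
    return out


# Hypothetical codebook of four-pixel vectors.
codebook = [
    [20, 20, 20, 20],       # flat dark
    [200, 200, 200, 200],   # flat bright
    [20, 20, 200, 200],     # rising edge
    [200, 200, 20, 20],     # falling edge
]
row = [22, 18, 25, 19, 21, 24, 190, 205, 198, 202, 201, 197]
indices = vq_encode(row, codebook, 4)
approximation = vq_decode(indices, codebook)
```

Twelve pixels are represented by three indices of two bits each. The reconstruction is only as good as the match between the codebook and the image, which is why a representative training set matters. 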

3. Adaptive 

Most of the algorithms listed can be adaptive. The process of making them 
adaptive requires an additional pass through the image to optimize the coding of the 
coefficients. Adaptive algorithms attempt to eliminate the artifacts produced by most 
compression algorithms at high compression ratios. The most common artifact is tiling, 
which is caused by the image being divided into blocks and then coded independently of 
the surrounding blocks. Adaptive DPCM reduces these effects by using the surrounding 
blocks as inputs for compressing each block. Frequency coding, such as vector 
quantization, also develops artifacts at high compression ratios, as previously stated. 

An example of an adaptive subband coder is the Adaptive Perceptual Image Coder 
(APIC). With locally adaptive perceptual quantization, it minimizes the perceptual 
distortion of the image. The quantization is based on an estimate of the amount of 
masking available for each subband coefficient. (Figure 10) (Gonzalez et al, 1994) 
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Figure 10 Typical APIC coder 
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V. ALGORITHMS TESTED 


A. RADIANT TIN 

RTN is a system of software tools designed to make imagery available at all 
levels of the military force-structure using low-bandwidth channels. The project's goals 
are to use existing hardware; provide high-ratio image compression with low file 
overhead; minimize or eliminate the need for added equipment; and provide a simple, 
graphical user-interface. RTN uses a multi-step process to improve 
compression performance. 

The first step in the compression process is to define the edges in the image. Each 
of the edges is coded into a vector consisting of a starting point and a path to the end of 
the edge. The second step is to compare the textures on either side of each of the edge 
vectors. The texture information allows for a texture gradient across each edge. 

The information about the image is now coded using two different methods. The 
edge information is coded using a symbolic transform method. The remaining texture 
area is coded using a wavelet transform. The second method transforms spatial domain 
information using a residual error approach (Figure 11) (Beser, 1994). The Radiant TIN 
algorithm uses either symbolic decomposition or frequency decomposition to achieve 
compression ratios of more than 100 to 1 (Figure 12) (ISOA, 1995). 
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Figure 11 Radiant-Tin Spatial Transformation Flow Diagram 
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Figure 12 Radiant TIN Compression Algorithm 
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Sanford (1995) tests RTN against JPEG, Yuval Fisher’s Fractal Compression 
Program, and Aware Corporation Wavelet compression algorithms. He concludes that 
JPEG does well at low compression ratios, while RTN does well at high compression 
ratios. Sanford’s thesis employs both quantitative and qualitative testing for a difference 
in the algorithms. The qualitative evaluation consisted of a self-paced paired comparison 
test. The subjects’ task was to choose the best image from the pair. Subjects rated the 
Wavelet image highest, followed by RTN. 

B. INTERIM LOW BIT RATE 

LBR is a command-line compression algorithm based on the JPEG compression 
engine. LBR can increase the achievable compression ratios by downsampling the image 
before compressing it with a JPEG encoder. In the reconstruction phase, LBR upsamples 
the image after using the JPEG decoder (Figure 13). Downsampling the image causes the 
image to be blurred and introduces aliasing, while the JPEG encoder causes blocking at 
high compression ratios. LBR reduces these effects by adjusting the relative compression 
contributions from the two components. (Lan & Reitz, 1996) 

Figure 13 LBR System 

The downsampling is based on the combination of a discrete-to-continuous-spatial 
(D/C) conversion and an anti-aliasing filter. These filters work as one filter by 
combining with each other through multiplication. The downsampling is performed on 
the rows of the image and then again on the columns of the image from the first 
downsampling. The upsampling is performed by first applying a D/C conversion to the 
compressed image and then resampling at a higher rate. This upsampling is performed in 
the reverse order of rows and columns from that used in the downsampling. The order of 
compressing the rows and columns may be reversed for software optimization. (Lan & 
Reitz, 1996) 
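
The separable row-then-column structure can be sketched as follows. This is not the LBR filter design: the pair-averaging filter, the factor of two, the sample repetition used for upsampling, and the sample image are all assumptions made for illustration, and the JPEG step between the two halves is omitted. 

```python
def downsample(samples):
    """Average each pair of neighbors (a crude anti-aliasing filter), one value per pair."""
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples) - 1, 2)]


def upsample(samples):
    """Double the sample count by repeating each value."""
    return [v for v in samples for _ in range(2)]


def on_rows(image, fn):
    """Apply a one-dimensional operation to every row."""
    return [fn(row) for row in image]


def on_columns(image, fn):
    """Apply a one-dimensional operation to every column."""
    columns = [fn(list(col)) for col in zip(*image)]
    return [list(row) for row in zip(*columns)]


def shrink(image):
    return on_columns(on_rows(image, downsample), downsample)   # rows first, then columns


def enlarge(image):
    return on_rows(on_columns(image, upsample), upsample)       # columns first, then rows


# Hypothetical 4 x 4 image.
image = [[10, 12, 50, 52], [10, 12, 50, 52], [90, 92, 30, 32], [90, 92, 30, 32]]
small = shrink(image)       # a JPEG encode and decode of `small` would sit here
restored = enlarge(small)
```

The restored image has the original dimensions but has lost the fine variation within each 2 x 2 neighborhood, which corresponds to the blurring that downsampling introduces. 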

C. TITAN ICE 

Titan ICE is based on a combination of wavelet and subband transformations. 
The algorithm is designed to compress radiological and other high-resolution digital 
imagery. These images are compressed to 30 to 1 while still maintaining sufficient 
quality to allow diagnostic readings (Reiter, 1996). The algorithm uses the JND 
threshold to adjust the compression equations to optimize the compression ratio at any 
compression quality. ICE uses the same algorithms for compression, but exceeds the 
JND threshold for lower quality levels in order to achieve the higher compression ratios. 
By coding each of the subbands independently with the wavelet transformation, ICE is 
able to reduce the perceptual errors introduced during coding. 

ICE is a two-channel algorithm, which applies a low-pass filter to the rows of the 
image and then discards every other sample in the row. It then applies the high-pass filter 
to the rows of the image and discards every other sample. This process of filtering and 
then downsampling is then applied to the columns of the image. The low-pass filter 
extracts the trends, whereas the high-pass filter extracts the detail. This is a single-level 
wavelet transform. This process may be repeated on any subband, but is usually done 
only on subbands that are formed from applying the filters and downsampling to both 
rows and columns. (Reiter, 1996) 
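
A single-level transform of this kind can be sketched with the simplest possible filter pair. The sketch assumes Haar-style filters (pair averages for the low-pass channel and pair differences for the high-pass channel) and even image dimensions; the actual ICE filters are not given in the text, and the sample image is hypothetical. 

```python
def analyze(samples):
    """Filter and keep every other output: pair averages (trend) and pair differences (detail)."""
    low = [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    high = [(samples[i] - samples[i + 1]) / 2 for i in range(0, len(samples), 2)]
    return low, high


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def split_rows(image):
    """Apply the two channels to every row, giving a low band and a high band."""
    pairs = [analyze(row) for row in image]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def single_level(image):
    """Rows first, then columns, producing four subbands."""
    low, high = split_rows(image)
    ll, lh = (transpose(band) for band in split_rows(transpose(low)))
    hl, hh = (transpose(band) for band in split_rows(transpose(high)))
    return ll, lh, hl, hh


# Hypothetical 4 x 4 image made of two flat horizontal stripes.
image = [[10] * 4, [10] * 4, [50] * 4, [50] * 4]
ll, lh, hl, hh = single_level(image)
```

For this image all of the information ends up in the band that was low-pass filtered in both directions, and the three detail bands are zero. Repeating the process on that band gives the next level of the transform. 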

Reiter (1996) has done some comparative work between JPEG and Titan ICE using 
peak signal-to-noise ratio (pSNR) and maximum error. These results were also compared 
to those of qualitative measurements by human observers. Reiter acknowledges the need for perceptual 
measurements, but presents the mathematical results as a relative comparison. In the 
pSNR-to-compression-ratio graph (Figure 14) and the Max-error-to-compression-ratio graph 
(Figure 15), wavelet compression outperformed JPEG. In the perceptual testing, subjects 
preferred the wavelet algorithm to JPEG even with identical pSNR. (Reiter, 1996) 
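
The two numeric measures can be computed as sketched below. The sketch uses the usual definition of pSNR in decibels and assumes 8-bit imagery with a peak value of 255; the exact definitions used by Reiter are not given in the text, and the pixel values are hypothetical. 

```python
import math


def psnr(original, compressed, peak=255):
    """Peak signal-to-noise ratio in decibels between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf   # identical images
    return 10 * math.log10(peak ** 2 / mse)


def max_error(original, compressed):
    """Largest absolute difference at any single pixel."""
    return max(abs(a - b) for a, b in zip(original, compressed))


original = [100, 110, 120, 130]      # hypothetical pixel values
compressed = [101, 108, 120, 133]    # hypothetical values after compression
quality_db = psnr(original, compressed)
worst = max_error(original, compressed)
```

A higher pSNR and a lower maximum error both indicate a closer numeric match, although, as the perceptual results above show, two images with the same pSNR are not necessarily judged equal by observers. 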

Experiments using ICE and JPEG revealed that ICE has one half as many numeric 
errors as the JPEG compressed images. Additional experiments used the two algorithms 
to compress medical x-ray images for a subjective comparison. The compressed images 
were then examined by a radiologist who preferred ICE to JPEG. The medical image test 
concentrated on JND. (Reiter, 1996) The point at which each algorithm passes the JND 
threshold is a good indication of how each will perform at higher compression ratios. 
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D. PURPOSE AND RATIONALE 
Lossy compression algorithms significantly reduce the transmission times and 
storage space required for digital imagery. This reduction of image file size is necessary 
to meet increasing imagery requirements without upgrading current systems throughout 
the military. Additionally, the military does not have a standard format for tactical 
imagery. The standardization of the image format will allow for seamless transfers of 
imagery during joint operations. Currently, N6 is evaluating three lossy algorithms, 
ICE, LBR, and RTN, to replace the current algorithm being used by the Navy. All three 
of these new algorithms can achieve compression ratios in excess of 60 times the 
maximum compression ratio of the algorithm currently used by the Navy. The Navy has 
proprietary rights to the code for RTN; therefore, selecting either ICE or LBR will 
require additional funding. This thesis will help N6 to determine if RTN should be 
selected as the Navy’s new compression algorithm, or ultimately the DoD’s. 

Ultimately, this thesis will provide information to the DoD for procuring image 
compression software for the 21st century. Specifically, the thesis will compare the three 
lossy algorithms LBR, ICE, and RTN to the compression standard currently used 
by the DoD. Finally, if ICE or LBR is judged better than RTN, the thesis will determine 
whether it is significantly better than RTN. 
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VI. METHODS 


A. GENERAL METHODS 

Four experiments are conducted to compare the performance of three algorithms. 
Reaction time and accuracy are collected for target detection and identification testing 
using both simple and complex background images. Additionally, pairwise subjective 
comparisons of image quality are collected. 
B. DETECTION 

1. Participants 

The five volunteer subjects for this experiment consist of three men and two 
women. All possess at least 20/20 corrected vision. The average subject age is 31 with a 
standard deviation of three. Subjects are naive to the purpose of the experiment and none 
have participated in previous visual search experiments. 

2. Apparatus 

The experimental workstation consists of a Pentium 200 MHz personal computer 
equipped with a Texas Instruments TMS340 Video Board and the corresponding TIGA 
Interface to Vision Research Graphics (VRG) software. The stimuli are presented on an 
IDEK MF-8521 High Resolution color monitor (21” X 20” viewable area) equipped with 
an anti-reflection, non-glare, P-22 short persistence CRT. Pixel size is 0.26’ horizontal by 
0.28’ vertical, with 800 X 600 square pixel resolution, and the frame rate is 98.9 Hz. Brightness 
of the monitor is linearized by means of an 8-bit look-up table (LUT) for the red, blue, 
and green guns. Responses are recorded on the number pad of a standard computer 
keyboard. The monitor and keyboard are placed on separate desks with a black cloth 
draped over both to prevent surface glare. Mesopic viewing conditions are maintained 
using a small floor lamp (6.8 cd/m² luminance) placed on the floor behind the IDEK 
monitor. A chair and a chin rest (both adjustable) are provided for subject comfort and to 
help maintain the appropriate distance and viewing angle. 

3. Stimuli 

The images consist of 10 background scenes of rural and urban settings, with a 
target located in one of three regions (center, right, or left). The distracter images use the 
same background scenes, but with no target. Each of the images is compressed by the 
Applied Physics Laboratory’s image coding division, utilizing RTN and LBR, at five 
image qualities. The target is a man standing or sitting in plain view. The images are 
then cropped to a square 460 X 460 pixel size in order to simulate output devices 
commonly found in military applications. The net result of this process is 600 images: 
300 images with targets (10 scenes x 3 target locations x 5 compression ratios x 2 
algorithms) and 300 distracter images. 
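
The count of target images follows from the factorial structure of the stimuli, as the short enumeration below shows. The labels used for the scenes and the quality levels are hypothetical placeholders; only the number of levels of each factor and the stated total of 300 distracter images are taken from the text. 

```python
from itertools import product

scenes = range(1, 11)                       # 10 background scenes
locations = ("center", "right", "left")     # 3 target regions
qualities = range(1, 6)                     # 5 image qualities
algorithms = ("RTN", "LBR")                 # 2 algorithms

# One target image for every combination of the four factors.
target_images = list(product(scenes, locations, qualities, algorithms))
distracter_count = 300                      # as stated in the text
total_images = len(target_images) + distracter_count
```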

After manipulation, all stimuli are converted to 8-bit, indexed color, IBM 
compatible image files for interface with the experimental hardware and software. The 
mean luminance of the images presented is 5.7 cd/m². Due to the test equipment 
limitation of 256 colors in PCX format, the images are converted from 24-bit gray 
scale to 8-bit gray scale. Image Alchemy is used to do this using a Floyd-Steinberg 
dithering process. Due to limitations in the LUT of the VRG software, “noise” is 
introduced in the upper regions of some of the images that are compressed to 125% of 
their original size. This anomaly does not significantly influence the experiment, since 
the affected region does not contain the target. 

4. Procedure 

Subjects complete three sessions with short rest intervals between sessions. Each 
subject is read task instructions and given an opportunity to ask questions. Because there 
are only five subjects, a "randomized block" design is employed where the subjects are 
the blocking variables and images are shown in a random order to control for nuisance 
variables such as fatigue. Blocking reduces variability due to subjects’ 
individual differences so that potential sensory and scene differences 
can be discerned. (Hayes, 1988) 

In vision research, there are ‘targets,’ which are the objects of interest, and 
‘distracters,’ which are everything else. For this experiment, images containing a signal 
are considered targets and the images where a signal is not present are considered 
distracters. A standard visual search paradigm requires that equal numbers of targets and 
distracters be presented in an experiment. Accordingly, one matching distracter image for each scene 
is provided. 

Stimuli are flashed (using a square wave pulse) on the center of the screen in a 10 
cm X 10 cm square and are viewed from a distance of 148 cm, thus subtending a 5.6° x 
5.6° visual area on the retina. The stimulus is present until the subject makes a selection 
or until a maximum of six seconds viewing time has elapsed. The experiment then 


proceeds to the next trial, 200 ms after the response is made. A tone provides feedback 
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when the subject responds incorrectly for the type of image (target/distracter) that is 
presented. 
C. ACCURACY IN IDENTIFICATION TEST 
1. Subjects 
This test is divided into two separate tests, simple and complex. The division of 
this test is covered in the stimuli section. 
a. Simple 
The 26 subjects participating in the study are all U.S. Navy active duty. 
Twenty-four of the subjects are enlisted with an average rank between E-3 and E-4. The 
average length of service completed is 4.2 years. Five of the enlisted are female. The 
two officers tested have completed a Department head tour and are both Lieutenant 
Commanders with an average length of service of 14.5 years. All subjects have 
correctable 20/20 vision in both eyes. 
b. Complex 
This experiment uses ten subjects, seven male and three female. All 
subjects have correctable 20/20 vision in both eyes. 
2. Apparatus 
a. Simple 
The test computers all have 90 MHz Pentium CPUs, with 16Mbits memory 
or better. A 15-inch color video monitor with .28 dot pitch resolution running at 800 X 
600 display size is used. The test is given at several locations using different machines. 
The user is seated approximately two feet in front of the monitor. The exact distance and 
monitor are not controlled, which may add to the variations in the response. Because of 
the varied equipment in use today by DoD, no standard display is chosen for this 
experiment. A Visual Basic program was written to display images randomly while 
recording the subjects’ accuracy and response time. The images are centered on the 
screen, and the time required to identify the target is recorded in 100 ms intervals. 
b. Complex 
The complex image test uses a single Pentium I 233 computer with a 21” 

NEC MultiSync XE21 monitor. The monitor resolution is set to 1024 X 768 pixels. All 
other aspects of the experiment are the same as the simple image accuracy test. 

3. Stimuli 

This test is further divided into the identification of both an object with a simple 
background, and an object with a complex background. These two tests are referred to as 
the simple images and complex images, respectively. The simple images are composed 
of U.S. Navy ships, both on the open sea and near land. There are seven different types 
of ships. Each image is compressed with RTN, LBR, and ICE at five image qualities. 
Several images are not compressed to every image quality level due to software 
limitations. All the scenes are 800 X 600 pixels in size. The net result of the 
compressions is 377 images (7 ships x 4 views x 3 algorithms x 5 image qualities + 28 
original images - 71 unattainable images). 

The complex images are composed of automobiles. There are five different types 
of cars, and four different views of each car. Each image is compressed with RTN, LBR, 
and ICE at five image qualities. Several images are not compressed to every image 
quality level due to software limitations. All the scenes are 640 X 480 pixels in size. The 
net result of the compressions is 300 images (5 cars x 4 views x 3 algorithms x 5 image 
qualities). 
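
The two image totals quoted for the simple and complex stimulus sets can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the function and variable names are illustrative and do not come from the test software.

```python
# Counts are taken from the text; all names here are illustrative only.
def stimulus_count(objects, views, algorithms, qualities, originals=0, unattainable=0):
    """Images in a fully crossed stimulus set, adjusted for originals and gaps."""
    return objects * views * algorithms * qualities + originals - unattainable

# Simple images: 7 ships, 4 views, 3 algorithms, 5 qualities, 28 originals, 71 gaps.
simple_total = stimulus_count(7, 4, 3, 5, originals=28, unattainable=71)
# Complex images: 5 cars, 4 views, 3 algorithms, 5 qualities.
complex_total = stimulus_count(5, 4, 3, 5)
print(simple_total, complex_total)
```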

Each original image is compressed using the three new compression algorithms at 
five compression ratios. All combinations of algorithm and compression ratio are 
applied to each image to produce fifteen new images. The LBR compression algorithm is 
used first on each image. LBR is the most limiting of the three algorithms because 
it provides little control of the compression ratio. Inputting an image quality level into LBR 
or RTN indirectly controls the compression ratio of the final image. LBR has only five 
compression qualities, whereas RTN has one hundred. ICE has both an image-quality 
setting and an image-compression-ratio setting. ICE can compress images at ratios from 
2 to 1 up to 100 to 1. The compression ratio is calculated by dividing the original file storage 
size by the storage size of the image at each compression quality. The compression ratio is 
used to verify that the final compressed images are of equivalent compression. 
Compressed images are considered equivalent in image quality if their compression ratios 
are within 10 to 1 of each other. 
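
As a minimal sketch of the ratio calculation and the equivalence check described above, the following assumes that "within 10 to 1" means the two ratios differ by no more than 10. That reading, the function names, and the file sizes are assumptions made for illustration.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Original file size divided by compressed file size (50.0 means 50 to 1)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes

def are_equivalent(ratio_a: float, ratio_b: float, tolerance: float = 10.0) -> bool:
    """Assumed reading of 'within 10 to 1': the ratios differ by at most 10."""
    return abs(ratio_a - ratio_b) <= tolerance

# Hypothetical file sizes, for illustration only.
original = 1_440_000
lbr = compression_ratio(original, 24_000)   # 60.0
rtn = compression_ratio(original, 22_500)   # 64.0
print(are_equivalent(lbr, rtn))
```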

RTN is applied to the same images that are compressed with LBR. The mean 
compression ratios at each of the five image qualities are calculated. ICE is set to the 
mean compression ratio for each image. Because ICE is limited to a 100 to 1 
compression ratio, only the mean compression ratios of less than 110 to 1 are compressed 
with ICE. 

4. Procedure 

The accuracy test is designed to determine the effect each algorithm has on image 
quality. It also measures how increased compression ratios affect the ability of subjects to 
identify objects in the images. Each subject is shown an image and asked to 
identify the item depicted. The image remains on the screen until the subject 
clicks on the image with the mouse. The subject is then presented a list of choices from 
which to indicate the object that is displayed. The answer is recorded along 
with the image viewed. Each subject sees each image in the data set. The images are 
randomized each time by the computer to reduce the possibility of either a learned 
response or an order effect. A blank screen is displayed for 500 ms before displaying the 
next image. 
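
The per-subject randomization described above can be sketched as follows. The original program was written in Visual Basic and is not reproduced here; this Python fragment, its function name, and the image identifiers are hypothetical.

```python
import random

def trial_order(image_ids, seed=None):
    """Return a freshly shuffled copy of the image list for one subject."""
    rng = random.Random(seed)      # seed is optional; fixed only for repeatable demos
    order = list(image_ids)        # copy so the master list stays untouched
    rng.shuffle(order)
    return order

images = [f"img_{n:03d}" for n in range(1, 11)]   # hypothetical identifiers
order_a = trial_order(images, seed=1)
order_b = trial_order(images, seed=2)
```

Every subject still sees every image; only the order of presentation changes between sessions.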
D. REACTION TIME IN IDENTIFICATION TEST 

1. Subjects 

The subjects that participate in the complex image accuracy test also participate 
in the complex image reaction time test. Likewise, the subjects that participate in the 
simple accuracy test also participate in the simple image reaction time test. 

2. Apparatus 

The apparatus for the complex image accuracy test is used in the complex image 
reaction time test. Likewise, the apparatus for the simple accuracy test is used by the 
simple image reaction time test. 
3. Stimuli 
The stimulus for the complex image accuracy test is used in the complex image 
reaction time test. Similarly, the stimulus for the simple accuracy test is used by the 
simple image reaction time test. 
4. Procedure 
The reaction time experiment is run concurrently with the accuracy test. During 
the accuracy test, the program records the reaction time of the subject. The reaction time 
is calculated from the time the image is displayed to the time the subject clicks on the 
image with the mouse. The reaction time is recorded in 100 ms increments. 
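
The timing rule above can be illustrated with a short sketch. The text states only that reaction time is recorded in 100 ms increments; truncation to the lower increment (rather than rounding) is an assumption, as are the function and parameter names.

```python
def recorded_reaction_time_ms(display_ms: int, click_ms: int, step_ms: int = 100) -> int:
    """Elapsed time from display to click, truncated to a multiple of step_ms."""
    elapsed = click_ms - display_ms
    if elapsed < 0:
        raise ValueError("click cannot precede display")
    return (elapsed // step_ms) * step_ms   # assumed: truncate, not round
```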
E. PAIRED COMPARISON 
1. Subjects 
a. Simple 
Twenty-five subjects, 24 active-duty U.S. Navy personnel and one 
Marine, participate in this portion of the study. All of the subjects are enlisted with an 
average rank between E-3 and E-4. The average length of service completed is 4.0 years. 
Five of the enlisted are female. All subjects have correctable 20/20 vision in both eyes. 
b. Complex 
Nine subjects, seven male and two female, participate in this experiment. 
Eight of the nine subjects are military officers with an average length of service of eight 
years. The eight officers are in pay grade O-3 or O-4. The remaining subject is a 
civilian. All subjects have correctable 20/20 vision in both eyes. 
2. Apparatus 

The apparatus for the complex image accuracy test is used in the complex image 
reaction time test. Similarly, the apparatus for the simple accuracy test is used by the 
simple image reaction time test. 

3. Stimuli 

The stimulus for the complex image accuracy test is used in the complex image 
reaction time test. Likewise, the stimulus for the simple accuracy test is used by the 
simple image reaction time test. The simple test has additional images added to increase 
the number of images compressed at lower image quality levels for ICE. The additional 
images are of the same type as the original set. Sets of image pairs are made from the 
compressed images. Each pair is of the same scene at the same compression level. Every 
combination of algorithms is used. 
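
A minimal sketch of how such image pairs could be enumerated is given below. It pairs every two algorithms for one scene at one compression level and lists each pairing in both presentation orders, as the procedure for this test calls for. The scene name and function name are hypothetical.

```python
from itertools import combinations

ALGORITHMS = ("RTN", "LBR", "ICE")

def image_pairs(scene, quality, algorithms=ALGORITHMS):
    """All algorithm pairings for one scene and quality level, in both orders."""
    pairs = []
    for first, second in combinations(algorithms, 2):
        pairs.append((scene, quality, first, second))
        pairs.append((scene, quality, second, first))   # reversed presentation order
    return pairs

pairs = image_pairs("scene_01", 4)   # hypothetical scene name
```

With three algorithms this yields three pairings, or six ordered presentations, per scene and level.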

4. Procedure 

The subjects complete two sessions with a short rest interval between sessions. 
The objective of this experiment is to collect a more subjective ranking of the three 
algorithms. A Visual Basic program displays each image in an image set centered on the 
screen for 500 ms in a random order. The screen is blanked for 500 ms between the 
image pairs. The subject is then given the following choices: the first image is better, the 
second image is better, or the images are the same. The image sets are shown in both 
orders of presentation to reduce any order effect on the outcome. 
VII. DATA ANALYSIS 


LBR is limited to five preset image qualities, thus it is not possible to compress 
each scene to the same compression ratio. For this reason, most of the data analysis is 
based on the LBR image quality settings. Compressed images are considered equivalent 
in image quality if their compression ratio is within 10 to 1. The difference in 
compression ratios is minimized to reduce any bias towards an algorithm. This 
difference is only a factor in image quality levels one and two; image quality levels three, 
four, and five have compression ratio differences of less than 2 to 1. 

Three independent experiments are conducted. The first experiment collects 
accuracy and reaction time to determine if a target is in the scene. This experiment is an 
initial trial to determine if further testing is warranted. The second experiment collects 
accuracy and reaction time to determine target identification. The final experiment 
collects the subjects’ subjective preference between pairs of images. 

A. DETECTION 

ICE is not available for the detection experiment; therefore, only RTN and LBR 
are evaluated. 

1. Accuracy 

A 2 x 5 x 5 x 2 Analysis of Variance (ANOVA) is conducted, where the four 
factors are the type of algorithm (LBR or RTN), the compression ratio (1, 2, 3, 4, 5), the 
identity of the subject (A, B, C, D, E) and the target (target or distracter). The dependent 
variable of the ANOVA is the mean proportion correct of each object type. Stepwise 
linear regression is used to arrive at the final ANOVA model. This model has been checked 
to verify that all significant interactions are included in the model. The residuals are also 
checked to verify the ANOVA modeling assumptions. 
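
The dependent variable named above, the proportion correct, can be tabulated for each level of a factor before any model is fitted. The sketch below shows one way to do so; the trial records are invented for illustration and are not data from the experiment.

```python
from collections import defaultdict

def proportion_correct(trials, factor):
    """Proportion of correct responses for each level of one factor."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for trial in trials:
        level = trial[factor]
        totals[level] += 1
        hits[level] += 1 if trial["correct"] else 0
    return {level: hits[level] / totals[level] for level in totals}

# Hypothetical trial records; not data from the experiment.
trials = [
    {"algorithm": "RTN", "target": "target", "correct": True},
    {"algorithm": "RTN", "target": "distracter", "correct": True},
    {"algorithm": "LBR", "target": "target", "correct": False},
    {"algorithm": "LBR", "target": "distracter", "correct": True},
]
by_algorithm = proportion_correct(trials, "algorithm")
by_target = proportion_correct(trials, "target")
```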

The residual values show no trend and equal variance. With the exception of the 
tails, the residuals also appear normally distributed. With the large sample size in this 
experiment, this departure from normality will not affect the interpretation of the 
ANOVA results. As can be seen in Table 8, all four factors (subject, algorithm, 
compression ratio, and target) are significant. In addition, there appear to be important 
two-way and three-way interactions. A significant main effect for algorithm shows that 
the subjects make fewer errors in identifying targets with RTN than with LBR 
(Figure 16). The compression main effect shows that as the image quality decreases 
(compression increasing), the accuracy of identifying targets decreases. The subject main 
effect shows the variability in the ability of subjects to identify targets correctly. The 
type of image, target or distracter, is the most significant main effect. The subjects’ 
ability to identify images correctly is significantly lower for identifying targets than for 
identifying distracters. There are no non-significant main effects. 
Table 8 ANOVA of Accuracy of Detection 
Figure 16 Proportion Correct by Each Factor, Detection 

The factors combine with each other in two-way interactions to affect the 
accuracy of identifying targets. A compression by target interaction shows that as the 
compression ratio increases (lower image quality), the subjects identify distracters more 
accurately than targets (Figure 17). The compression by algorithm interaction shows a 
more rapid decrease in accuracy as image quality decreases for LBR compared to RTN 
(Figure 18). The algorithm by target interaction shows a significant 
difference, in that images compressed with RTN are identified more accurately than 
images compressed with LBR (Figure 19). RTN and LBR do not affect the distracters. 
The effect of the algorithm is expected from the results of the interactions of the image 
quality and target type with the algorithm. 
Figure 17 Proportion Correct by Compression Level and Target, Detection 
Figure 18 Proportion Correct by Compression Level and Algorithm, Detection 
Figure 19 Proportion Correct by Target and Algorithm, Detection 
The three-way interaction of target and algorithm with compression is significant. 
The plot of this three-way interaction (Figure 20) shows the different effects that target and 
algorithm have on accuracy as compression increases. There is no difference in the 
distracter at any compression ratio. However, the target algorithm lines show there is a 
difference as the compression ratio increases. All other interactions are non-significant. 
Figure 20 Proportion Correct by Compression Level, Algorithm and Target, 
Detection 


2. Reaction Time 

A 2 x 5 x 5 x 2 ANOVA is calculated where the four factors are the type of 
algorithm (LBR or RTN), the compression ratio (1, 2, 3, 4, 5), the identity of the subject 
(A, B, C, D, E) and the target (target or distracter). The dependent variable of the 
ANOVA is the average reaction time for each object type. Stepwise linear regression 
is used to develop the model used in the ANOVA table. The final model has been 
checked to verify that all significant interactions are included in the model. The ANOVA 
results are shown in Table 9. 

            Df   Sum of Sq   Mean Sq   F Value   Pr(F) 
Residuals   84     2468559     29388 

Residual standard error: 171.4281 

Table 9 ANOVA of Reaction Time of Detection 
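
The residual row of Table 9 is internally consistent, which can be confirmed with the standard relationships between sum of squares, degrees of freedom, mean square, and residual standard error. The sketch below uses only the values printed in the table; the function names are illustrative.

```python
import math

def mean_square(sum_of_squares: float, df: int) -> float:
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df

def residual_standard_error(sum_of_squares: float, df: int) -> float:
    """Square root of the residual mean square."""
    return math.sqrt(mean_square(sum_of_squares, df))

# Residual row as printed in Table 9.
ms = mean_square(2468559, 84)
rse = residual_standard_error(2468559, 84)
print(round(ms), round(rse, 4))
```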





ANOVA requirements of equal variance and normality of the residuals are again 
verified. The residual values show no trend and equal variance. With the exception of the 
tails, the residuals appear normally distributed. Subjects take longer to identify targets 
with RTN than with LBR (Figure 21). The effect of the subjects shows the variability in 
reaction time of subjects to correctly identify targets. The image type affects the 
subjects’ reaction time in identifying images. The distracters take significantly longer to 
identify than targets. 
Figure 21 Mean Reaction Time by Each Factor, Detection 

The mean reaction times differ by less than 25 milliseconds from the slowest to 
the fastest compression ratio (Figure 21). This gives the impression that differences in 
compression levels do not affect mean reaction times. However, a significant 
compression by target interaction can be seen (Figure 22). The subjects’ reaction time 
decreases as the compression ratio increases for identifying distracters. In contrast, the 
reaction time increases when identifying targets as the compression ratio increases. The 
algorithm-by-target comparison also shows a significant difference. The reaction time 
increases for RTN compressed images compared to LBR. The reaction time for 
identifying distracters is unchanged by both algorithms (Figure 23). 
Figure 22 Mean Reaction Time by Compression Level and Target, Detection 
Figure 23 Mean Reaction Time by Target and Algorithm, Detection 


None of the three-way interactions are significant. 
B. ACCURACY IN IDENTIFICATION TEST 
1. Simple Images 

Due to ICE being limited to a maximum compression ratio of 100 to 1, all three 
algorithms are only compared at image compression settings four and five. The level of 
accurately identifying ICE images is not considered at image compression level three due 
to having only two observations per subject. A stepwise regression is used to develop the 
initial model for the one-way ANOVA. The initial model is then modified to reflect 
significant interaction terms. 

A 4 x 5 x 26 x 8 ANOVA is calculated where the four factors are the type of 
algorithm (LBR, RTN, ICE and ORG), the image quality (1, 2, 3, 4, 5), the identity of the 
subject (unique number) and the target (0, 2, 3, 4, 5, 6, 8, 9). The dependent variable of 
the ANOVA is the proportion correct of each object type. The final model was checked 
to verify that all significant interactions are included in the model. The ANOVA results 
are shown in Table 10. 
Table 10 ANOVA of Accuracy of Simple Images 
A significant effect by algorithm shows that there is a difference in the ability to 
identify objects when compressed with different algorithms. From the least effect on 
accuracy to the greatest reduction in accuracy, the algorithms are the original image, ICE, 
RTN, and LBR (Figure 24). The effect of compression levels shows that as the image 
quality decreases (compression increases) the accuracy of identifying objects decreases. 
The main effects of algorithm and image quality can be better seen in Figure 25. The 
effect of object shows that the type of ship has a major effect on accuracy. Subject 
interviews after the experiment reveal that several of the subjects could not identify 
amphibious ships. The subjects show a wide range of ability in identifying ships. This 
result is expected after having reviewed the post-experiment interviews. Over half of the 
subjects participating have no fleet experience and are in the beginning phase of their 
ship identification training. 
Figure 24 Proportion Correct by Factor, Simple Background 
Figure 25 Proportion Correct by Algorithm and Image Quality Factors, Simple 
Background 


Four of the six possible interactions between algorithm, image quality, subject, 
and object are significant. A significant algorithm by object interaction shows that the 
compression algorithm affects the accuracy of identifying objects depending on the 
object type (Figure 26). The image quality by object interaction shows that not all 
objects compress equally (Figure 27). The varied level of image compression affects the 
subjects’ accuracy in identifying the objects. The general trend is that the uncompressed 
images are identified with the highest accuracy. The most compressed images (lowest 
image quality) have the lowest identification rate. This general decrease is expected. 

Additionally, the inclusion of the uncompressed images shows that the algorithms 
used to compress the images are not solely responsible for the subjects’ inability to 
identify the ships. A significant subject-object interaction shows that there is a wide 
range of ability in identifying different target types by the subjects. This large difference 
is attributed to the disparity between the experienced and non-experienced subjects 
discussed under main effects. The graph shows no obvious trends and is therefore not 
included. 
Figure 26 Proportion Correct by Algorithm and Object, Simple Background 
Figure 27 Proportion Correct by Image Quality and Object, Simple Background 

There is no significant interaction between the compression algorithm and the 
compression ratio (image quality), but the algorithms do affect the accuracy. The 
differences in the interaction between algorithm and compression ratio are used to help 
determine which compression algorithm will be used in the future. The general trend is 
that images compressed with ICE are identified with greater accuracy for image qualities 
four and five (least compressed) (Figure 28). At image quality settings one through four, 
images compressed with RTN are more accurately identified than images compressed 
with LBR. 
Figure 28 Proportion Correct by Image Quality and Algorithm, Simple Background 
A paired comparison using Bonferroni’s method was used to check for significance 
between RTN and ICE. Using a 0.01 level of significance ($Z_{\alpha/3} = 2.7131$) there is 
significance only at image quality level three (Table 11). A 0.05 level of significance 
($Z_{\alpha/3} = 3.1280$) does not alter the results of the paired comparison. 

image quality level | test stat 

Table 11 Test Statistics for Paired Test of Accuracy 
2. Complex Images 
Due to the image complexity, the maximum compression ratio achieved by all the 
algorithms is greatly reduced compared to the simple images. ICE’s limitation of a 
maximum compression ratio of 100 to 1 allows for testing all three algorithms down to 
image quality two. Again, only RTN and LBR are compared at the lowest image quality 
(1), the highest compression ratio. A stepwise regression is used to develop the initial 
model for the one-way ANOVA. The initial model is then modified to reflect significant 
interaction terms. 

A 3 x 6 x 10 x 5 ANOVA is calculated where the four factors are the type of 
algorithm (LBR, RTN, and ICE), the image quality (1, 2, 3, 4, 5, 6), the identity of the 
subject (unique number) and the target (SU, SE, MV, SP, ST). The dependent variable of 
the ANOVA is the proportion correct of each object type. The final model has been 
checked to verify that all significant interactions are included in the model. The ANOVA 
results are shown in Table 12. 
Table 12 ANOVA of Accuracy of Complex Images 
A significant effect of algorithm shows there is a difference in the subjects’ ability 
to identify objects when compressed with different algorithms. From the least effect on 
accuracy to the greatest reduction in accuracy, the algorithms are ICE, RTN, and LBR 
(Figure 29). The effect of compression levels shows that as the image quality decreases 
(compression increasing) the accuracy of identifying objects decreases. The image 
quality level zero is formed from a set of RTN and ICE images.² Level zero images have 
an image quality between one and two. The effect of object shows that the object type 
has an effect on accuracy. The subject main effect shows that the subjects have an effect 
on accuracy. The subject effect on accuracy is less significant with the complex images 
due to the object selection for the test. To reduce the subject effect, easily identifiable 
cars are used as target objects. However, one of the cars (ST) is noticeably harder to 
identify. 
Figure 29 Proportion Correct by Factor, Complex Background 
A significant algorithm by object interaction shows the effect of the object type 
on accuracy of identifying that type (Figure 30). The image quality by object interaction 
shows that not all objects compress equally (Figure 31). These varied levels of image 
compression affect the accuracy of the subjects in identifying the objects. The general 
trend is that the most compressed images (lowest image quality) have the lowest rate of 
identification. This general decrease is expected. 

² These images would have been image quality one except that software limitations prevented the images 
from being compressed to the equivalent level of the LBR images. 

Figure 31 Proportion Correct by Image Quality and Object, Complex Background 

A significant subject-object interaction shows that there are significant 
differences among the subjects’ abilities to identify different target types. The graph 
shows no obvious trends and is therefore not included. A significant interaction between 
the compression algorithm and the image quality shows that as image quality decreases 
(compression ratio increasing) accuracy also decreases (Figure 32). No discernible 
trends are evident in the remaining interaction terms, which are not shown. A paired 
comparison gives a test statistic of less than 1.6 at all levels between RTN and ICE. The 
critical Z value is 2.8070 at a 0.01 level of significance and 2.2414 at a 0.05 level of 
significance. 
Figure 32 Proportion Correct by Image Quality and Algorithm, Complex 
Background 
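
The two critical Z values quoted for the complex images agree numerically with a one-sided standard normal quantile at the significance level divided by four. The text does not state how the Bonferroni adjustment was set up, so the divisor of four in the sketch below is an inference from the numbers, not a documented choice; the function name is illustrative.

```python
from statistics import NormalDist

def bonferroni_critical_z(alpha: float, comparisons: int) -> float:
    """One-sided standard normal critical value at level alpha / comparisons."""
    return NormalDist().inv_cdf(1.0 - alpha / comparisons)

# comparisons=4 is an assumption chosen because it reproduces the quoted values.
z_01 = bonferroni_critical_z(0.01, 4)
z_05 = bonferroni_critical_z(0.05, 4)
print(round(z_01, 4), round(z_05, 4))
```

A test statistic below these thresholds, such as the values under 1.6 reported here, is not significant at either level.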





A linear regression is calculated for each algorithm against compression ratio 
(Figure 33). The regression shows that at compression ratios greater than 48 to 1 the 
subjects prefer ICE to RTN. Additionally, the subjects prefer RTN to LBR. The order of 
preference is reversed at compression ratios less than 48 to 1. One of the images 
compressed with ICE is missed more often at the lower compression ratio than at the higher 
compression ratio. The most likely reason is that there was a learning effect. However, 
the images are shown in a random order to reduce the learning effect. 
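
The crossover point in such an analysis is the compression ratio at which two fitted lines intersect. The sketch below fits a least-squares line per algorithm and solves for the intersection. The accuracy values are made up so that the arithmetic is easy to follow; they are not the thesis data and do not reproduce the 48 to 1 figure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def crossover(line_a, line_b):
    """x at which two fitted lines intersect (the lines must not be parallel)."""
    (slope_a, icpt_a), (slope_b, icpt_b) = line_a, line_b
    return (icpt_b - icpt_a) / (slope_a - slope_b)

# Made-up accuracy values, chosen only so the example is easy to follow.
ratios = [20, 40, 60, 80]
ice = fit_line(ratios, [0.90, 0.88, 0.86, 0.84])
rtn = fit_line(ratios, [0.94, 0.90, 0.86, 0.82])
cross = crossover(ice, rtn)
```

In this invented example the lines meet at a ratio of 60 to 1; above that point the line with the shallower slope gives the higher predicted accuracy.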
Figure 33 Linear Regression of Compression Ratio on Accuracy, Complex Images 





C. REACTION TIME IN IDENTIFICATION TEST 

1. Simple Images 

A 3 x 26 x 8 ANOVA is conducted where the three factors are the type of 
algorithm (ORG, LBR, RTN, and ICE), the identity of the subject (unique number) and 
the target (0, 2, 3, 4, 5, 6, 8, 9). The dependent variable of the ANOVA is the average 
reaction time for each object type. The final model was checked to verify that all 
significant interactions are included in the model. The ANOVA results are shown in 
Table 13. 


              Df    Sum of Sq     Mean Sq     F Value       Pr(F) 
Subject (s)   25  [illegible] [illegible] [illegible]   0.0000000 
s and a       75       651.82     8.69087    1.471559   0.0057007 
s and o      175      1328.19     7.58967    1.285100   0.0085668 
Residuals   2496     14741.12     5.90590 

Residual standard error: 2.430205 

Table 13 ANOVA of Reaction Time of Simple Images 





A significant effect for subject shows that there is a difference in subjects’ 
reaction time (Figure 34). This difference in subject ability is discussed in Chapter VII 
section B sub-section 2. The effect of object shows that the type of ship has a large 
effect on reaction time. Both algorithm and compression ratio (image quality) are 
non-significant main effects. However, algorithm is left in the model because of 
significant interaction terms involving algorithm. 
Figure 34 Mean Reaction Time by Factor, Simple Background 
Three main factors (algorithm, object, and subject) combine with each other in 
two-way interactions to affect the reaction times. A significant algorithm by subject 
interaction and a significant subject by object interaction can be seen. No trend in the 
data can be discerned, so the graphs are not included. There are no other significant 
interactions. There is no trend observed in any of the reaction time data. 

The only exception is at image quality three, where the reaction time significantly 
increases for ICE (Figure 35). The ICE sample size at this level is ten percent of that of 
RTN and LBR. Each subject saw only three images compressed with ICE at image quality 
three. A paired comparison using Bonferroni’s method was used to check for significance 
between RTN and ICE. Using a 0.01 level of significance ($Z_{\alpha/3} = 2.7131$) there is 
significance only at image quality level three (Table 14). A 0.05 level of significance 
($Z_{\alpha/3} = 3.1280$) does not alter the results of the paired comparison. 

image quality level | test stat 
                    | 4.0505 
                    | 0.0874 

Table 14 Test Statistics for Paired Test of Reaction Time 
Figure 35 Reaction Time by Algorithm, Simple Images 
2. Complex Images 
A 3 x 6 x 10 x 5 ANOVA is calculated where the four factors are the type of 
algorithm (LBR, RTN, and ICE), the image quality (1, 2, 3, 4, 5, 6), the identity of the 
subject (unique number) and the target (SU, SE, MV, SP, ST). The dependent variable of 
the ANOVA is the average reaction time of each object type. The final model has been 
checked to verify that all significant interactions are included in the model. The ANOVA 
results are shown in Table 15. 

             Df    Sum of Sq     Mean Sq           F       Pr(F) 
Object (o)    4  [illegible] [illegible] [illegible]   0.0000000 
s and a      18       7.4704    0.415024 [illegible]   0.0271356 

Residual standard error: 0.486552 

Table 15 ANOVA of Reaction Time of Complex Images 





A non-significant effect for algorithm shows there is no difference in the reaction 
times when images are compressed with different algorithms (Figure 36). Algorithm is 
left in the model because of significant interaction terms involving algorithm. The effect 
of compression levels shows that as the image quality decreases (compression increasing) 
the reaction time of identifying objects increases. The image quality level zero is 
discussed in Chapter VII section B sub-section 2. The effect of object shows that the 
object type has an effect on reaction time. The subject main effect shows the subjects 
have an effect on reaction time. The subject effect on reaction time is less with the 
complex images due to the object selection for the test as discussed in Chapter VII 
section B sub-section 2. 
Figure 36 Main effects on Time 





A significant subject algorithm interaction shows there is a significant difference 
in the subjects’ reaction time in identifying different objects when compressed with the 
different algorithms. The graph shows no obvious trends and is therefore not included. 
The algorithm image quality interaction is significant. Additionally, a paired comparison 
gives a test statistic of less than 2.04 at all levels between RTN and ICE. The critical Z 
value is 2.8070 at a 0.01 level of significance and 2.2414 at a 0.05 level of significance. 
The algorithm and the compression ratio (image quality) interaction shows that as 
compression level increases (image quality decreasing) the reaction time increases 
(Figure 37). The three-way interaction between algorithm, compression level and object 
type is significant, but is not shown here. 
Figure 37 Reaction Time by Algorithm, Complex Images 

D. PAIRED COMPARISON 

In this analysis, subjects are shown two images and asked which is better or whether 
they are the same. These pairwise comparisons are evaluated using a Bradley-Terry model 
(David, 1988). The Bradley-Terry model is modified to take into account the option of 
the pair of images being judged the same. The model used in this thesis also contains a 
weight (wt) that represents the bias towards the first image presented. A weight greater 
than one indicates the first image is preferred. A weight less than one indicates the 
second image presented is preferred. This modified Bradley-Terry model generates 
scores (s) with the most preferred algorithm receiving the highest score. The scores are 
odds in the sense that the probability that algorithm i is preferred to algorithm j when i is 
presented first is 

$$\frac{wt \cdot s_i}{wt \cdot s_i + s_j}.$$ 

The inputs for the model are two 3 by 3 matrices. The (i, j)th element of the first 
matrix (A) is the total number of times that algorithm i is preferred to algorithm j. If the 
two are judged the same then one half is added to both the (i, j)th and (j, i)th elements. No 
image is ever compared to itself. The second matrix (A1) is the total number of times the 
algorithm presented first is preferred over the one presented second. Again, one half is 
added to each element in the case of the algorithms being judged equal. 
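
The preference probability defined by the weighted Bradley-Terry model can be evaluated directly once scores and a weight are available. In the sketch below the weight is the sample value reported in Table 18, while the two scores are invented for illustration; the function name is hypothetical, and the fitting of the scores themselves is not shown.

```python
def prob_first_preferred(score_first: float, score_second: float, weight: float) -> float:
    """Probability that the first-presented algorithm is preferred."""
    return weight * score_first / (weight * score_first + score_second)

# The weight is the sample value reported in Table 18; the scores are invented.
wt = 1.551
s_ice, s_lbr = 2.0, 1.0
p_ice_first = prob_first_preferred(s_ice, s_lbr, wt)   # ICE shown first
p_lbr_first = prob_first_preferred(s_lbr, s_ice, wt)   # LBR shown first
```

With equal scores and a weight of one the probability is one half; a weight above one shifts it toward whichever image is shown first.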

For example, Table 16 contains the raw data from a comparison of the three 
algorithms. The table shows that if ICE is presented first, it is preferred 60 times when 
compared against LBR (First Image Preferred). If LBR is presented first, ICE is 
preferred 95 times (Second Image Preferred). ICE and LBR are judged equal 68 times 
when ICE is displayed first, and 70 times when LBR is displayed first (Images Equal). 





Table 16 Raw Data Matrixes 


The columns labeled A in Table 17 show the sum of the times ICE is preferred over LBR 
plus 0.5 for each time they are judged equal (60 + 95 + [68 + 70] * 0.5 = 224). Of these 224 
instances, 129 come from trials where ICE is displayed before LBR 
(60 + [68 + 70] * 0.5 = 129). 
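
The two worked figures above can be reproduced with the counts quoted from Table 16. The sketch follows the arithmetic exactly as written in the text; the function and variable names are illustrative.

```python
def preference_total(first_pref: int, second_pref: int, ties_a: int, ties_b: int) -> float:
    """Wins in both presentation orders plus half a point per tie."""
    return first_pref + second_pref + 0.5 * (ties_a + ties_b)

# Counts for ICE versus LBR as quoted from Table 16.
a_ice_lbr = preference_total(60, 95, 68, 70)
# The text's second figure uses the ICE-first wins with the same tie term.
a1_ice_lbr = 60 + 0.5 * (68 + 70)
print(a_ice_lbr, a1_ice_lbr)
```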





Table 17 Input Matrixes 




The results of the Bradley-Terry model from Table 17 are given in Table 18. The weight 
value is the bias value. The number is greater than one, therefore the first image is more 
likely to be preferred than the second image regardless of the algorithm. In this case, ICE 
is the preferred algorithm followed by LBR, then RTN. 


|        | ICE   | LBR | RTN |
| weight | 1.551 |     |     |


Table 18 Bradley-Terry Sample Results 










1. Simple Images 

Image quality one is entered into the Bradley-Terry model as a 2 by 2 matrix with 
RTN and LBR only. Table 19 shows the results from the simple images used in a 
pairwise comparison. The results are represented graphically in Figure 38. All numbers 
are rounded to three decimal places. RTN is preferred over LBR at all but the least 
compressed level. ICE is preferred over RTN and LBR at all image quality levels where 
ICE is present. As the image quality decreased and compression increased, the bias 
towards the first image being preferred increased. 


Table 19 Bradley-Terry Simple Image Scores 
[Figure: Preferred Algorithm, Simple Images; axis label: Image Quality]

Figure 38 Pairwise Comparison Scoring of Simple Images 

2. Complex Images 

ICE is not compared against RTN and LBR at the lowest image quality (the highest 
compression ratio) because of the software limit. ICE is preferred over both RTN and 
LBR at image qualities five down to two (Table 20). As the image quality decreased, ICE 
is preferred at an increasing rate (Figure 39). The bias towards the first image 
presented again increased as the image quality decreased, with one exception: at image 
quality four, the bias dropped slightly. 

RTN is judged better than LBR at all but the highest image quality. The analysis 
of the comparison between RTN and LBR is done as a 2 by 2 matrix with 4 points added 
to both the A and A1 matrixes. The diagonals of both the A and A1 matrixes are set to zero. 
The addition is made to lower the bias towards the first image. The addition to the 
original values does not change the overall ranking significantly: the change in ranking 
score is less than 0.001 as the addition value is changed. This addition does reduce the 
bias toward the first image; 10.140 is a lower bound for the true value. 



















Figure 39 Pairwise Comparison Scoring of Complex Images 
VII. CONCLUSION 


This thesis compares the three lossy algorithms Interim Low Bit Rate, Titan ICE, 
and Radiant TIN at compression ratios greater than the current DoD compression 
standard (NITF) can achieve. The goal of this thesis is to determine whether LBR or ICE 
performs significantly better than RTN. Three experiments are conducted to determine 
the effect of these compression algorithms on target detection, target identification, and 
reaction time. Additionally, the subjects are shown two compressed images and asked to 
select a preferred image. All three algorithms compress images to at least a 100 to 1 
compression ratio, exceeding NITF’s maximum compression ratio. 

The algorithm and compression ratio did not affect the identification of ships in a 
simple background. There are large individual differences in the respective subjects’ 
abilities to identify the ships. Greater accuracy in identifying ships could be gained by 
additional training of the subjects. Additional testing is 
recommended with simple image backgrounds using commonly known target objects, 
such as cars, to reduce the effect of subject training and knowledge on the accurate 
identification of targets. The identification of cars in the complex background shows that 
ICE performs best, followed by RTN, then LBR. However, there is no statistically 
significant difference between ICE and RTN. RTN consistently performs better than 
LBR in target detection. It also performs better than LBR in the identification and 
subjective rankings of both the complex and simple images. There is no statistically 
significant difference in reaction times based on compression algorithms. The only result 
consistent throughout the reaction time testing is that as the compression ratios increase, 
the subjects’ reaction times slow. 

The results of subjective rankings of image quality are consistent across both the 
simple and complex background test images. ICE is the algorithm subjectively preferred 
over either of the other algorithms at all compression ratios. RTN is consistently 
preferred over LBR with the exception of the lowest compression ratios. This effect is 
also observed in the accuracy testing. The exception reflects LBR’s ability to compress 
images with less noticeable changes at the lowest compression ratios and thus does not 
invalidate the overall preference for RTN. 

ICE’s overall performance is better than RTN’s; however, the difference between 
the two algorithms is not statistically significant. Additionally, the ICE compression 
software is limited by its graphical user interface to compression ratios of 100 to 1; RTN 
does not have this limitation. Furthermore, the Navy already has the proprietary rights to 
RTN and would not have to purchase a new algorithm. RTN is the recommended 
compression algorithm. Any future testing should include the NITF 2.0 standard that was 
released at the completion of this study, and ICE should be reevaluated if the 100 to 1 
software limitation is removed. 
APPENDIX A. TEST IMAGES 


The following are the images for the accuracy and reaction time test. Both the 
simple background and complex background image sets are included. 


A. SIMPLE IMAGES 
B. COMPLEX IMAGES 

The following images were used in the complex background accuracy and 
reaction time test. 


[Complex background test images: Mini Van (object MV), Sedan (object SE), 
Sports Car (object SP), Sports Utility Vehicle (SU)] 





APPENDIX B. BRADLEY-TERRY MODEL 


The following S-Plus 4.0 code was used to implement the Bradley-Terry pairwise 
comparison model. 


wts1 <- function(A, A1, tot = 10, add = 0) 
{ 
    # fname is wts1 
    # A is the win-loss matrix; A1 is win-loss matrix for the 
    # row index being the host team or the first presented object; 
    # w1 is the set of Ford weights w/o consideration for home team 
    # or order of presentation; w is the weight set adjusted for such 
    # considerations and then th is the odds multiplier for first 
    # presentation or host team. 
    # weights scaled so that total weight is 10. This may be over- 
    # ridden in the call by using another value. 
    k <- dim(A)[1] 
    mat <- matrix(1, k, k) - diag(k) 
    A <- A + add * mat 
    A1 <- A1 + 0.5 * add * mat 
    AA <- apply(A, 1, sum) 
    N <- A + t(A) 
    w <- AA/apply(N, 1, sum) 
    w <- (tot * w)/sum(w) 
    ep <- 1e-006 
    w0 <- w    # now we can find the raw Ford weights 
    repeat { 
        ww <- w 
        tt <- outer(w, w, "+") 
        w <- AA/apply(N/tt, 1, sum) 
        w <- (tot * w)/sum(w) 
        if(max(abs(w - ww)) < ep) 
            break 
    } 
    # the raw weights are used as a 'warm' start 
    # for the adjusted weights 
    w1 <- w 
    th <- 1 
    A11 <- sum(A1) 
    N1 <- A1 + t(A - A1) 
    repeat { 
        th0 <- th 
        ww <- w 
        NN <- outer(th * w, w, "+") 
        tt <- apply(N1/NN, 1, sum) 
        th <- A11/sum(w * tt) 
        tt <- tt * th 
        w <- AA/tt 
        w <- (tot * w)/sum(w) 
        if(abs(th - th0) < ep) { 
            if(max(abs(w - ww)) < ep) { 
                break 
            } 
        } 
        # if((abs(th - th0) < ep) && (max(abs(w - ww)) < ep)) break 
    } 
    # output has three rows. The first contains the raw weights, 
    # the second contains the adjusted weights, the third has theta. 
    z <- rbind(w1, w, c(th, rep(0, k - 1))) 
    z 
} 
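
For readers without access to S-Plus, the same two-stage iteration can be sketched in Python. This is a hedged port for illustration, not the code used in the thesis: the function name `bradley_terry_order`, the `max_iter` guard, and the sample 2 by 2 matrices are all assumptions, and the sketch assumes both input matrices have zero diagonals (no image is compared to itself).

```python
def bradley_terry_order(A, A1, tot=10.0, add=0.0, eps=1e-6, max_iter=10000):
    """Return (raw weights, order-adjusted weights, theta).
    A[i][j]  : times i is preferred to j (ties count one half).
    A1[i][j] : times i is preferred to j when i is presented first."""
    k = len(A)
    idx = range(k)
    # Optional off-diagonal offset, mirroring the 'add' argument.
    A = [[A[i][j] + (add if i != j else 0.0) for j in idx] for i in idx]
    A1 = [[A1[i][j] + (0.5 * add if i != j else 0.0) for j in idx] for i in idx]
    wins = [sum(row) for row in A]                       # AA in the listing
    N = [[A[i][j] + A[j][i] for j in idx] for i in idx]  # comparisons per pair

    def rescale(w):
        total = sum(w)
        return [tot * x / total for x in w]

    # Stage 1: raw (Ford) weights, ignoring presentation order.
    w = rescale([wins[i] / sum(N[i]) for i in idx])
    for _ in range(max_iter):
        prev = w
        w = rescale([wins[i] / sum(N[i][j] / (prev[i] + prev[j])
                                   for j in idx if j != i) for i in idx])
        if max(abs(a - b) for a, b in zip(w, prev)) < eps:
            break
    raw = w

    # Stage 2: weights adjusted for first-presentation bias (theta).
    th = 1.0
    first_wins = sum(sum(row) for row in A1)             # A11 in the listing
    # N1[i][j]: number of trials in which i is presented first against j.
    N1 = [[A1[i][j] + (A[j][i] - A1[j][i]) for j in idx] for i in idx]
    for _ in range(max_iter):
        th_prev, prev = th, w
        tt = [sum(N1[i][j] / (th * w[i] + w[j]) for j in idx if j != i)
              for i in idx]
        th = first_wins / sum(w[i] * tt[i] for i in idx)
        w = rescale([wins[i] / (tt[i] * th) for i in idx])
        if abs(th - th_prev) < eps and max(abs(a - b) for a, b in zip(w, prev)) < eps:
            break
    return raw, w, th


# Hypothetical counts for two algorithms (not data from the experiments).
A = [[0, 30], [10, 0]]
A1 = [[0, 20], [5, 0]]
raw, adjusted, theta = bradley_terry_order(A, A1)
print(raw, adjusted, theta)
```

In this hypothetical example the raw weights come out as 7.5 and 2.5, matching the 30 to 10 split in preferences, and theta is greater than one because the image shown first is preferred in 25 of the 40 trials.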